CKA Exam Questions

1. Node list

Get the list of nodes in JSON format and store it in a file at /opt/output/nodes-z3444kd9.json

k get nodes -o json > /opt/output/nodes-z3444kd9.json

2. service

Create a service named messaging-service to expose the messaging application within the cluster on port 6379.

k get pods

k get svc

k expose pod messaging --port=6379 --name=messaging-service

3. deployment

Create a deployment named hr-web-app using the image kodekloud/webapp-color with 2 replicas.

k create deployment hr-web-app --image=kodekloud/webapp-color --replicas=2

4. static pod

Create a static pod named static-busybox on the controlplane node that uses the busybox image and the command sleep 1000.

k run static-busybox --image=busybox --dry-run=client -o yaml --command -- sleep 1000 > static-busybox.yaml

cat static-busybox.yaml

mv static-busybox.yaml /etc/kubernetes/manifests/

5. namespace

create a pod in the finance namespace named temp-bus with the image redis:alpine

kubectl create namespace finance

k run temp-bus --image=redis:alpine --namespace=finance

6. Nodeport

Expose the hr-web-app deployment as a service named hr-web-app-service, reachable on port 30082 on the nodes of the cluster.

The web application listens on port 8080

k get deploy

k expose deploy hr-web-app --name=hr-web-app-service --type=NodePort --port=8080

k describe svc hr-web-app-service

k edit svc hr-web-app-service   # set nodePort: 30082

7. Get NodeInfo

use JSON PATH query to retrieve the osImages of all the nodes and store it in a file /out/outputs/nodes_os_x43kj56.txt

k get nodes -o json

k get nodes -o jsonpath='{.items[*].status.nodeInfo.osImage}' > /out/outputs/nodes_os_x43kj56.txt

8. Pod Scheduling 1

Schedule a Pod as follows:

Name:nginx-kusc00101
Image:nginx
Nodeselector:disk=ssd

kubectl run nginx-kusc00101 --image=nginx --dry-run=client -o yaml > nginx.yaml

The generated manifest:

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: nginx-kusc00101
  name: nginx-kusc00101
spec:
  containers:
  - image: nginx
    name: nginx-kusc00101
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}

Edit nginx.yaml to add the nodeSelector under spec:

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: nginx-kusc00101
  name: nginx-kusc00101
spec:
  containers:
  - image: nginx
    name: nginx-kusc00101
    resources: {}
  nodeSelector:
    disk: ssd
  dnsPolicy: ClusterFirst
  restartPolicy: Never
status: {}

kubectl apply -f nginx.yaml

9. Deployment Rolling Update

Create a deployment as follows:

Name: nginx-app

Using container nginx with version 1.11.0-alpine

The deployment should contain 3 replicas

Next, deploy the app with new version 1.11.3-alpine by performing a rolling update and record that update.

Finally, rollback that update to the previous version 1.11.0-alpine

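kubectl create deploy nginx-app --image=nginx:1.11.0-alpine --replicas=3

# rolling update; --record stores the command as the change cause (deprecated in newer kubectl, but still accepted)
kubectl set image deployment nginx-app nginx=nginx:1.11.3-alpine --record

kubectl rollout history deploy nginx-app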

kubectl get deploy -o wide

kubectl rollout undo deployment nginx-app

kubectl get deploy -o wide

kubectl rollout status deployment nginx-app

10. Secret

Create a Kubernetes Secret as follows:

Name: super-secret

Data: credential: alice, username: bob

Create a Pod named pod-secrets-via-file using the redis image which mounts a secret named super-secret at /secrets

Create a second Pod named pod-secrets-via-env using the redis image, which exports credential/username as TOPSECRET/CREDENTIALS

k create secret generic super-secret --from-literal=credential=alice --from-literal=username=bob

k run pod-secrets-via-file --image=redis --dry-run=client -o yaml > q5-1-pod.yaml

vim q5-1-pod.yaml

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: pod-secrets-via-file
  name: pod-secrets-via-file
spec:
  containers:
  - image: redis
    name: pod-secrets-via-file
    resources: {}
    volumeMounts:
    - name: foo
      mountPath: /secrets
  volumes:
  - name: foo
    secret:
      secretName: super-secret
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}

k run pod-secrets-via-env --image=redis --dry-run=client -o yaml > q5-2-pod.yaml

vim q5-2-pod.yaml

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: pod-secrets-via-env
  name: pod-secrets-via-env
spec:
  containers:
  - image: redis
    name: pod-secrets-via-env
    resources: {}
    env:
    - name: TOPSECRET
      valueFrom:
        secretKeyRef:
          name: super-secret
          key: credential
    - name: CREDENTIALS
      valueFrom:
        secretKeyRef:
          name: super-secret
          key: username
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}

kubectl apply -f q5-1-pod.yaml -f q5-2-pod.yaml

11. etcd backup

Take a backup of the etcd cluster and save it to /tmp/etcd-backup.db

apt install etcd-client
ETCDCTL_API=3 etcdctl version
cd /etc/kubernetes/manifests
cat etcd.yaml
...
## Note these file paths; they are used as parameters in the commands below
    - --advertise-client-urls=https://172.17.0.14:2379
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    ...
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    ...
...

ETCDCTL_API=3 etcdctl member list --endpoints https://127.0.0.1:2379 \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
--cacert=/etc/kubernetes/pki/etcd/ca.crt

ETCDCTL_API=3 etcdctl snapshot save /tmp/etcd-backup.db \
--endpoints https://127.0.0.1:2379 \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
--cacert=/etc/kubernetes/pki/etcd/ca.crt

ETCDCTL_API=3 etcdctl snapshot status /tmp/etcd-backup.db -w table \
--endpoints https://127.0.0.1:2379 \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
--cacert=/etc/kubernetes/pki/etcd/ca.crt

+----------+----------+------------+------------+
|   HASH   | REVISION | TOTAL KEYS | TOTAL SIZE |
+----------+----------+------------+------------+
| 24b6552d |  5488124 |       1113 |     4.6 MB |
+----------+----------+------------+------------+

12. Volume

Create a Pod called redis-storage with image: redis:alpine with a Volume of type emptyDir that lasts for the life of the Pod. Specs on the right

Pod named ‘redis-storage’ created
Pod ‘redis-storage’ uses Volume type of emptyDir
Pod ‘redis-storage’ uses volumeMount with mountPath = /data/redis

kubectl run redis-storage --image=redis:alpine --restart=Never --dry-run=client -o yaml

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: redis-storage
  name: redis-storage
spec:
  containers:
  - image: redis:alpine
    name: redis-storage
    resources: {}
    volumeMounts:
    - name: cache-volume
      mountPath: /data/redis
  dnsPolicy: ClusterFirst
  restartPolicy: Never
  volumes:
  - name: cache-volume
    emptyDir: {}
status: {}

13. Security Context for a Pod

Create a new Pod called super-user-pod with image busybox:1.28. Allow the pod to be able to set system_time

The container should sleep for 4800 seconds

Pod: super-user-pod

Container Image: busybox:1.28

SYS_TIME capabilities for the container?

apiVersion: v1
kind: Pod
metadata: 
    name: super-user-pod
spec: 
    containers:
    - image: busybox:1.28
      name: super-user-pod
      ## add the securityContext field
      securityContext:
        capabilities:
          ## allow setting the system time (SYS_TIME)
          add: ["SYS_TIME"]
      ## make the container sleep for 4800 seconds
      command: ["sleep"]
      args: ["4800"]
    restartPolicy: Never

14. PV & PVC

A pod definition file is created at q9-pod.yaml. Make use of this manifest file and mount the persistent volume called pv-1. Ensure the pod is running and the PV is bound.

mountPath: /data

persistentVolumeClaim Name: my-pvc

persistentVolume Claim configured correctly
pod using the correct mountPath
pod using the persistent volume claim?

kubectl get pv
NAME   CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
pv-1   10Mi       RWO            Retain           Available                                   15m

vim q9-pvc.yaml

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Mi

kubectl apply -f q9-pvc.yaml

kubectl get pv
NAME   CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM            STORAGECLASS   REASON   AGE
pv-1   10Mi       RWO            Retain           Bound    default/my-pvc

14. Pod scheduling

Taint the worker node g8node1 to be Unschedulable. Once done, create a pod called dev-redis, image redis:alpine to ensure workloads are not scheduled to this worker node. Finally, create a new pod called prod-redis and image redis:alpine with toleration to be scheduled on g8node1.

key: env_type
value: production
operator: Equal
effect: NoSchedule

kubectl taint nodes g8node1 env_type=production:NoSchedule

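dev-redis has no toleration, so it cannot be scheduled on g8node1:

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: dev-redis
  name: dev-redis
spec:
  containers:
  - image: redis:alpine
    name: dev-redis-container
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Never
status: {}

prod-redis tolerates the taint (tolerations belong under spec):

apiVersion: v1
kind: Pod
metadata:
  labels:
    run: prod-redis
  name: prod-redis
spec:
  containers:
  - image: redis:alpine
    name: prod-redis-container
  tolerations:
  - key: "env_type"
    operator: "Equal"
    value: "production"
    effect: "NoSchedule"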

15. Init Pod

Create a pod as follows:

name: my-nginx

image: nginx

Add an init container to the Pod. Its role is to create an empty file at /cache/test.txt. The main container checks whether the file exists and exits if it does not.

apiVersion: v1
kind: Pod 
metadata:
  name: my-nginx
spec:
  containers:
  - name: my-nginx
    image: nginx
    imagePullPolicy: IfNotPresent
    command: ["/bin/sh"]
    args: ["-c", "cat /cache/test.txt && sleep 3600000"]
    volumeMounts:
    - mountPath: /cache
      name: cache-volume
  initContainers:
  - name: init-nginx
    image: busybox:1.28
    imagePullPolicy: IfNotPresent
    command: ['touch', '/cache/test.txt']
    volumeMounts:
    - mountPath: /cache
      name: cache-volume
  volumes:
  - name: cache-volume
    emptyDir: {}

16. JSON file

Use JSON PATH query to retrieve the architecture of all the nodes and store it in a file /opt/outputs/nodes_architecture.txt

The architecture is found in the nodeInfo section under the status of each node.

$ kubectl get nodes -o jsonpath='{.items[*].status.nodeInfo.architecture}' > /opt/outputs/nodes_architecture.txt

$ cat /opt/outputs/nodes_architecture.txt
amd64 amd64 amd64

17. ConfigMap

Create a configmap called cfgvolume with values var1=val1, var2=val2 and create an nginx pod with volume nginx-volume which reads data from this configmap cfgvolume and put it on the path /etc/cfg

apiVersion: v1
kind: ConfigMap
metadata:
  name: cfgvolume
data:
  var1: val1
  var2: val2

Or create the same ConfigMap imperatively:

kubectl create configmap cfgvolume --from-literal=var1=val1 --from-literal=var2=val2

vim nginx-configmap-pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  volumes:
  - name: nginx-volume
    configMap:
      name: cfgvolume
  containers:
  - name: nginx
    image: nginx
    volumeMounts:
    - name: nginx-volume
      mountPath: /etc/cfg

kubectl apply -f nginx-configmap-pod.yaml

kubectl exec -it nginx -- /bin/sh

cd /etc/cfg

18. Troubleshoot

The same 2-tier application is deployed in the test namespace. It must display the correct web page on success. It currently fails. Troubleshoot and fix the issue.

kubectl get all -n test
NAME                                READY   STATUS    RESTARTS   AGE
pod/mysql                           1/1     Running   0          7m30s
pod/webapp-mysql-5fb9ccd54d-9fv25   1/1     Running   0          7m29s

NAME                    TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
service/mysql-service   ClusterIP   10.111.102.167   <none>        3306/TCP         7m29s
service/web-service     NodePort    10.104.81.123    <none>        8080:30088/TCP   7m29s

NAME                           READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/webapp-mysql   1/1     1            1           7m29s

NAME                                      DESIRED   CURRENT   READY   AGE
replicaset.apps/webapp-mysql-5fb9ccd54d   1         1         1       7m29s

kubectl edit -n test po mysql

...
  ports:
  - containerPort: 3306
    protocol: TCP
...

kubectl edit -n test po webapp-mysql-5fb9ccd54d-9fv25

...
  ports:
  - containerPort: 8080
    protocol: TCP
...

kubectl edit -n test svc mysql-service
...
spec:
  clusterIP: 10.111.102.167
  ports:
  - port: 3306
    protocol: TCP
    targetPort: 3306
...

kubectl edit -n test svc web-service
...
spec:
  clusterIP: 10.104.81.123
  ports:
  - port: 8080
    protocol: TCP
    targetPort: 8080
    nodePort: 30088
...

Change nodePort from 30088 to 30081:

kubectl edit -n test svc web-service
...
spec:
  clusterIP: 10.104.81.123
  ports:
  - port: 8080
    protocol: TCP
    targetPort: 8080
    nodePort: 30081
...

kubectl describe po -n test | grep -i label
Labels:     name=mysql
Labels:     name=webapp-mysql

kubectl describe svc -n test | grep -i selector
Selector:         name: mysql
Selector:         name: webapp-mysql

Next, check the container's environment variables: do the username and password match the task?

Because the container is created by a Deployment, we check the Deployment's settings:

kubectl -n test edit deploy webapp-mysql
...
spec:
  containers:
  - env:
    - name: DB_Host
      value: mysql-service
    - name: DB_User
      value: sql-user
    - name: DB_Password
      value: paswrd
...

Here we find the second bug: DB_User is wrongly set to sql-user. We change DB_User to root:

kubectl -n test edit deploy webapp-mysql

...

spec:
  containers:
  - env:
    - name: DB_Host
      value: mysql-service
    - name: DB_User
      value: root
    - name: DB_Password
      value: paswrd
...

19. Troubleshooting

The cluster is broken. Something is wrong with scaling Pods. We just tried scaling the deployment to 2 replicas. But it's not happening. Troubleshoot and fix the issue.

kubectl get po -n kube-system
NAME                                      READY   STATUS             RESTARTS   AGE
...
kube-apiserver-g8master                     1/1     Running            1          6m7s
kube-controller-manager-g8master            0/1     CrashLoopBackOff   4          2m34s
kube-proxy-h2smv                          1/1     Running            0          5m49s
...

kubectl describe po -n kube-system kube-controller-manager-g8master
Events:
  Type     Reason   Age                     From             Message
  ----     ------   ----                    ----             -------
  Normal   Pulled   7m12s (x5 over 8m41s)   kubelet, master  Container image "k8s.gcr.io/kube-controller-manager:v1.18.0" already present on machine
  Normal   Created  7m12s (x5 over 8m41s)   kubelet, master  Created container kube-controller-manager
  Normal   Started  7m12s (x5 over 8m41s)   kubelet, master  Started container kube-controller-manager
  Warning  BackOff  3m33s (x27 over 8m37s)  kubelet, master  Back-off restarting failed container

kubectl logs -n kube-system kube-controller-manager-g8master
I0910 08:22:45.990175       1 serving.go:313] Generated self-signed cert in-memory
unable to load client CA file "/etc/kubernetes/pki/ca.crt": open /etc/kubernetes/pki/ca.crt: no such file or directory

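Found the problem: the ca.crt certificate file appears to be missing inside the container. Check whether the file exists in that directory on the host: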

$ ls /etc/kubernetes/pki/
apiserver-etcd-client.crt  apiserver-kubelet-client.crt  apiserver.crt  ca.crt  etcd                front-proxy-ca.key      front-proxy-client.key  sa.pub
apiserver-etcd-client.key  apiserver-kubelet-client.key  apiserver.key  ca.key  front-proxy-ca.crt  front-proxy-client.crt  sa.key

$ vim /etc/kubernetes/manifests/kube-controller-manager.yaml

...

    volumeMounts:
    ...
    ## find the name of the volumeMount
    - mountPath: /etc/kubernetes/pki
      name: k8s-certs
      readOnly: true
      ...
    ...

  ## further down, find the hostPath of the volume named k8s-certs
  volumes:
  ...
  - hostPath:
      path: /etc/kubernetes/aaaaa
      type: DirectoryOrCreate
    name: k8s-certs
  ...
...

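The certificates are present on the host, so the hostPath of the k8s-certs volume is wrong. Change it back to /etc/kubernetes/pki:

$ vim /etc/kubernetes/manifests/kube-controller-manager.yaml

...
  volumes:
  ...
  - hostPath:
      path: /etc/kubernetes/pki
      type: DirectoryOrCreate
    name: k8s-certs
  ...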
...

20. Credentials

Use context: kubectl config use-context k8s-c2-AC

The cluster admin asked you to find out the following information about etcd running on cluster2-controlplane1:

Server private key location
Server certificate expiration date
Is client certificate authentication enabled

Write this information into /opt/course/p1/etcd-info.txt

Finally you're asked to save an etcd snapshot at /etc/etcd-snapshot.db on cluster2-controlplane1 and display its status.

k get node

kubectl -n kube-system get pod

find /etc/kubernetes/manifests/

vim /etc/kubernetes/manifests/etcd.yaml   # look for --key-file and --client-cert-auth=true

openssl x509 -noout -text -in /etc/kubernetes/pki/etcd/server.crt | grep Validity -A2

Server private key location: /etc/kubernetes/pki/etcd/server.key
Server certificate expiration date: Sep 13 13:01:31 2022 GMT
Is client certificate authentication enabled: yes

21. user context

Use context: kubectl config use-context k8s-c1-H

You're asked to confirm that kube-proxy is running correctly on all nodes. For this perform the following in Namespace project-hamster:

Create a new Pod named p2-pod with two containers, one of image nginx:1.21.3-alpine and one of image busybox:1.31. Make sure the busybox container keeps running for some time.

Create a new Service named p2-service which exposes that Pod internally in the cluster on port 3000->80.

Find the kube-proxy container on all nodes cluster1-controlplane1, cluster1-node1 and cluster1-node2 and make sure that it's using iptables. Use command crictl for this.

Write the iptables rules of all nodes belonging to the created Service p2-service into file /opt/course/p2/iptables.txt.

Finally delete the Service and confirm that the iptables rules are gone from all nodes.

k get node

k -n project-hamster run p2-pod --image=nginx:1.21.3-alpine --restart=Never --dry-run=client -o yaml > p2.yaml

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: p2-pod
  name: p2-pod
  namespace: project-hamster
spec:
  containers:
  - image: nginx:1.21.3-alpine
    name: nginx
  - image: busybox:1.31
    name: busybox
    command: ["sh", "-c", "sleep 1d"]
  restartPolicy: Never
status: {}

k -f p2.yaml apply

apiVersion: v1
kind: Service
metadata:
  name: p2-service
  namespace: project-hamster
spec:
  selector:
    run: p2-pod
  ports:
    - protocol: TCP
      port: 3000
      targetPort: 80

Or create the same Service imperatively:

k -n project-hamster expose pod p2-pod --name p2-service --port 3000 --target-port 80

k -n project-hamster get pod,svc,ep

k get node

ssh cluster1-controlplane1

crictl ps | grep kube-proxy

crictl logs 27b6a18c0f89c
I0913 12:53:03.096620       1 server_others.go:212] Using iptables Proxier.

➜ ssh cluster1-controlplane1 iptables-save | grep p2-service >> /opt/course/p2/iptables.txt
➜ ssh cluster1-node1 iptables-save | grep p2-service >> /opt/course/p2/iptables.txt
➜ ssh cluster1-node2 iptables-save | grep p2-service >> /opt/course/p2/iptables.txt

k -n project-hamster delete svc p2-service

➜ ssh cluster1-controlplane1 iptables-save | grep p2-service
➜ ssh cluster1-node1 iptables-save | grep p2-service
➜ ssh cluster1-node2 iptables-save | grep p2-service

22. role

Create a new user called john and grant him access to the cluster. John should have permission to create, list, get, update and delete pods in the development namespace. The private key exists at /root/CKA/john.key and the CSR at /root/CKA/john.csr.

Important note: as of Kubernetes 1.19, the CertificateSigningRequest object expects a signerName.

Please refer to the documentation to see an example. The documentation tab is available at the top right of the terminal.

cat john.csr | base64 | tr -d "\n"
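
The request field of a CertificateSigningRequest needs the CSR as one unbroken base64 string, which is why the line wraps are stripped. A minimal sketch of that step, using a made-up placeholder instead of the real john.csr and assuming a base64 that accepts -d for decoding (GNU coreutils):

```shell
# Stand-in for john.csr (hypothetical content); a real CSR is multi-line PEM text.
sample_csr="-----BEGIN CERTIFICATE REQUEST-----
placeholder-body-not-a-real-request
-----END CERTIFICATE REQUEST-----"

# Encode, then strip the newlines that base64 may insert when wrapping its output.
encoded="$(printf '%s\n' "$sample_csr" | base64 | tr -d '\n')"

# Decode again to confirm that nothing was lost in the round trip.
decoded="$(printf '%s' "$encoded" | base64 -d)"

echo "$encoded"
```

With GNU coreutils, base64 -w 0 gives the same single-line result without the tr step.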

vi john-csr.yaml

apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest
metadata:
  name: john-developer 
spec:
  request: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURSBSRVFVRVNULS0tLS0KTUlJQ1ZqQ0NBVDRDQVFBd0VURVBNQTBHQTFVRUF3d0dZVzVuWld4aE1JSUJJakFOQmdrcWhraUc5dzBCQVFFRgpBQU9DQVE4QU1JSUJDZ0tDQVFFQTByczhJTHRHdTYxakx2dHhWTTJSVlRWMDNHWlJTWWw0dWluVWo4RElaWjBOCnR2MUZtRVFSd3VoaUZsOFEzcWl0Qm0wMUFSMkNJVXBGd2ZzSjZ4MXF3ckJzVkhZbGlBNVhwRVpZM3ExcGswSDQKM3Z3aGJlK1o2MVNrVHF5SVBYUUwrTWM5T1Nsbm0xb0R2N0NtSkZNMUlMRVI3QTVGZnZKOEdFRjJ6dHBoaUlFMwpub1dtdHNZb3JuT2wzc2lHQ2ZGZzR4Zmd4eW8ybmlneFNVekl1bXNnVm9PM2ttT0x1RVF6cXpkakJ3TFJXbWlECklmMXBMWnoyalVnald4UkhCM1gyWnVVV1d1T09PZnpXM01LaE8ybHEvZi9DdS8wYk83c0x0MCt3U2ZMSU91TFcKcW90blZtRmxMMytqTy82WDNDKzBERHk5aUtwbXJjVDBnWGZLemE1dHJRSURBUUFCb0FBd0RRWUpLb1pJaHZjTgpBUUVMQlFBRGdnRUJBR05WdmVIOGR4ZzNvK21VeVRkbmFjVmQ1N24zSkExdnZEU1JWREkyQTZ1eXN3ZFp1L1BVCkkwZXpZWFV0RVNnSk1IRmQycVVNMjNuNVJsSXJ3R0xuUXFISUh5VStWWHhsdnZsRnpNOVpEWllSTmU3QlJvYXgKQVlEdUI5STZXT3FYbkFvczFqRmxNUG5NbFpqdU5kSGxpT1BjTU1oNndLaTZzZFhpVStHYTJ2RUVLY01jSVUyRgpvU2djUWdMYTk0aEpacGk3ZnNMdm1OQUxoT045UHdNMGM1dVJVejV4T0dGMUtCbWRSeEgvbUNOS2JKYjFRQm1HCkkwYitEUEdaTktXTU0xMzhIQXdoV0tkNjVoVHdYOWl4V3ZHMkh4TG1WQzg0L1BHT0tWQW9FNkpsYWFHdTlQVmkKdjlOSjVaZlZrcXdCd0hKbzZXdk9xVlA3SVFjZmg3d0drWm89Ci0tLS0tRU5EIENFUlRJRklDQVRFIFJFUVVFU1QtLS0tLQo=
  signerName: kubernetes.io/kube-apiserver-client
  expirationSeconds: 86400  # one day
  usages:
  - client auth

k apply -f john-csr.yaml

k get csr

k certificate approve john-developer

k create role developer --verb=create,list,get,update,delete --resource=pods -n development

k create rolebinding john-developer --role=developer --user=john -n development

k auth can-i create pods --as=john -n development   # verify

23. DNS

Create an nginx pod called nginx-resolver using image nginx, and expose it internally with a service called nginx-resolver-service.

Test that you are able to look up the service and pod names from within the cluster. Use the image busybox:1.28 to create a pod for the DNS lookup. Record the results in /root/CKA/nginx.svc and /root/CKA/nginx.pod for service and pod name resolution respectively.

k run nginx-resolver --image=nginx

k get pods

k expose pod nginx-resolver --name=nginx-resolver-service --port=80

k run busybox --image=busybox:1.28 -- sleep 4000

k get pods -o wide   # note the IP of nginx-resolver, here 10.50.192.4

k exec busybox -- nslookup nginx-resolver-service > /root/CKA/nginx.svc

k exec busybox -- nslookup 10-50-192-4.default.pod.cluster.local > /root/CKA/nginx.pod
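
The pod name used in the last lookup is derived from the pod IP: dots become dashes, followed by the namespace and the pod DNS suffix. A small sketch of that derivation, assuming the IP 10.50.192.4 from the command above and the cluster domain cluster.local (both must be replaced with the values of the actual cluster):

```shell
# Assumed values for illustration; read the real IP from `k get pod nginx-resolver -o wide`.
pod_ip="10.50.192.4"
namespace="default"

# Pod DNS names use the IP address with dots replaced by dashes.
pod_dns="$(printf '%s' "$pod_ip" | tr '.' '-').${namespace}.pod.cluster.local"

echo "$pod_dns"
```

The resulting name can be passed directly to nslookup inside the busybox pod.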

24. ServiceAccount

create a new service account with the name pvviewer. Grant this Service account access to list all PersistentVolumes in the cluster by creating an appropriate cluster role called pvviewer-role and ClusterRoleBinding called pvviewer-role-binding.

Next, create a pod called pvviewer with the image redis and serviceAccount pvviewer in the default namespace.

ServiceAccount: pvviewer
ClusterRole: pvviewer-role
ClusterRoleBinding: pvviewer-role-binding
Pod: pvviewer
Pod configured to use ServiceAccount pvviewer

kubectl create serviceaccount pvviewer

k get sa

k create clusterrole pvviewer-role --verb=list --resource=persistentvolumes

k get clusterrole

k create clusterrolebinding pvviewer-role-binding --clusterrole=pvviewer-role --serviceaccount=default:pvviewer

k run pvviewer --image=redis --dry-run=client -o yaml > pvviewer.yaml

vi pvviewer.yaml   # add serviceAccountName: pvviewer under spec

k apply -f pvviewer.yaml  

25. JSONPATH

List the InternalIP of all nodes of the cluster. Save the result to a file /root/CKA/node_ips.

k get nodes -o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}' > /root/CKA/node_ips

26. Environment Variables

create a pod called multi-pod with two containers.

Container 1, name: alpha, Image: nginx
Container 2, name beta, Image: busybox, command: sleep 4800

Environment Variables:

container 1:
name: alpha
container 2:
name: beta

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: multi-pod
  name: multi-pod
spec:
  containers:
  - image: nginx
    name: alpha
    env:
    - name: name
      value: alpha
  - image: busybox
    name: beta
    command:
    - sleep
    - "4800"
    env:
    - name: name
      value: beta
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}

27. Security Context

Create a pod called non-root-pod with image redis:alpine.

runAsUser: 1000
fsGroup: 2000

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: non-root-pod
  name: non-root-pod
spec:
  securityContext:
    runAsUser: 1000
    fsGroup: 2000
  containers:
  - image: redis:alpine
    name: redis
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}

28. Contexts

You have access to multiple clusters from your main terminal through kubectl contexts. Write all those context names into /opt/course/1/contexts.

Next write a command to display the current context into /opt/course/1/context_default_kubectl.sh, the command should use kubectl.

Finally write a second command doing the same thing into /opt/course/1/context_default_no_kubectl.sh, but without the use of kubectl.

Maybe the fastest way is just to run:

k config get-contexts # copy manually
k config get-contexts -o name > /opt/course/1/contexts

Or using jsonpath:

k config view -o yaml # overview
k config view -o jsonpath="{.contexts[*].name}"
k config view -o jsonpath="{.contexts[*].name}" | tr " " "\n" # new lines
k config view -o jsonpath="{.contexts[*].name}" | tr " " "\n" > /opt/course/1/contexts

The content should then look like:

# /opt/course/1/contexts
k8s-c1-H
k8s-c2-AC
k8s-c3-CCC

Next create the first command:

# /opt/course/1/context_default_kubectl.sh
kubectl config current-context

➜ sh /opt/course/1/context_default_kubectl.sh
k8s-c1-H

And the second one:

# /opt/course/1/context_default_no_kubectl.sh
cat ~/.kube/config | grep current

➜ sh /opt/course/1/context_default_no_kubectl.sh
current-context: k8s-c1-H

In the real exam you might need to filter and find information from bigger lists of resources, hence knowing a little jsonpath and simple bash filtering will be helpful.

The second command could also be improved to:

# /opt/course/1/context_default_no_kubectl.sh
cat ~/.kube/config | grep current | sed -e "s/current-context: //"
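
The grep and sed steps can also be folded into a single sed call. The sketch below runs against a throwaway kubeconfig written to a temporary directory (its content is made up for illustration), so it does not depend on ~/.kube/config:

```shell
# Build a small sample kubeconfig; the file content is hypothetical.
tmpdir="$(mktemp -d)"
cat > "$tmpdir/config" <<'EOF'
apiVersion: v1
kind: Config
current-context: k8s-c1-H
contexts:
- name: k8s-c1-H
- name: k8s-c2-AC
EOF

# -n suppresses default output; the p flag prints only lines where the substitution matched.
current="$(sed -n 's/^current-context: //p' "$tmpdir/config")"

echo "$current"
rm -rf "$tmpdir"
```

For the exam file itself, the same sed expression would be pointed at ~/.kube/config.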

29. Service

Use context: kubectl config use-context k8s-c2-AC

Create a Pod named check-ip in Namespace default using image httpd:2.4.41-alpine. Expose it on port 80 as a ClusterIP Service named check-ip-service. Remember/output the IP of that Service.

Change the Service CIDR to 11.96.0.0/12 for the cluster.

Then create a second Service named check-ip-service2 pointing to the same Pod to check if your settings did take effect. Finally check if the IP of the first Service has changed.

Let's create the Pod and expose it:

k run check-ip --image=httpd:2.4.41-alpine

k expose pod check-ip --name check-ip-service --port 80

And check the Pod and Service ips:

➜ k get svc,ep -l run=check-ip
NAME                       TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)   AGE
service/check-ip-service   ClusterIP   10.104.3.45   <none>        80/TCP    8s

NAME                         ENDPOINTS      AGE
endpoints/check-ip-service   10.44.0.3:80   7s

Now we change the Service CIDR on the kube-apiserver

➜ ssh cluster2-controlplane1
➜ root@cluster2-controlplane1:~# vim /etc/kubernetes/manifests/kube-apiserver.yaml

# /etc/kubernetes/manifests/kube-apiserver.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-apiserver
    tier: control-plane
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-apiserver
    - --advertise-address=192.168.100.21
...
    - --service-account-key-file=/etc/kubernetes/pki/sa.pub
    - --service-cluster-ip-range=11.96.0.0/12             # change
    - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
    - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
..

Give it a bit for the kube-apiserver and controller-manager to restart

Wait for the api to be up again:

➜ root@cluster2-controlplane1:~# kubectl -n kube-system get pod | grep api
kube-apiserver-cluster2-controlplane1            1/1     Running   0              49s

Now we do the same for the controller manager:

➜ root@cluster2-controlplane1:~# vim /etc/kubernetes/manifests/kube-controller-manager.yaml

# /etc/kubernetes/manifests/kube-controller-manager.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-controller-manager
    tier: control-plane
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-controller-manager
    - --allocate-node-cidrs=true
    - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --bind-address=127.0.0.1
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --cluster-cidr=10.244.0.0/16
    - --cluster-name=kubernetes
    - --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
    - --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
    - --controllers=*,bootstrapsigner,tokencleaner
    - --kubeconfig=/etc/kubernetes/controller-manager.conf
    - --leader-elect=true
    - --node-cidr-mask-size=24
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --root-ca-file=/etc/kubernetes/pki/ca.crt
    - --service-account-private-key-file=/etc/kubernetes/pki/sa.key
    - --service-cluster-ip-range=11.96.0.0/12         # change
    - --use-service-account-credentials=true

Give it a bit for the controller-manager to restart.

We can check if it was restarted using crictl:

➜ root@cluster2-controlplane1:~# crictl ps | grep controller-manager
3d258934b9fd6    aca5ededae9c8    About a minute ago   Running    kube-controller-manager ...

Checking our existing Pod and Service again:

➜ k get pod,svc -l run=check-ip
NAME           READY   STATUS    RESTARTS   AGE
pod/check-ip   1/1     Running   0          21m

NAME                       TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
service/check-ip-service   ClusterIP   10.99.32.177   <none>        80/TCP    21m

Nothing changed so far. Now we create another Service like before:

k expose pod check-ip --name check-ip-service2 --port 80

And check again:

➜ k get svc,ep -l run=check-ip
NAME                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
service/check-ip-service    ClusterIP   10.109.222.111   <none>        80/TCP    8m
service/check-ip-service2   ClusterIP   11.111.108.194   <none>        80/TCP    6m32s

NAME                          ENDPOINTS      AGE
endpoints/check-ip-service    10.44.0.1:80   8m
endpoints/check-ip-service2   10.44.0.1:80   6m13s

There we go, the new Service got an ip of the new specified range assigned. We also see that both Services have our Pod as endpoint.
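
To double-check that an address really falls inside the new range, the membership test can be sketched in plain shell arithmetic. The helper names ip_to_int and ip_in_cidr are made up for this illustration; the addresses are the ones from the output above.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed (exit status 0) if address $1 lies inside CIDR block $2.
ip_in_cidr() {
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}

for addr in 10.109.222.111 11.111.108.194; do
  if ip_in_cidr "$addr" 11.96.0.0/12; then
    echo "$addr is inside 11.96.0.0/12"
  else
    echo "$addr is outside 11.96.0.0/12"
  fi
done
```

Only the second Service, created after the change, has an address inside the new range; the first one keeps the address it was given before.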

30. Schedule Pod on Controlplane Nodes

Use context: kubectl config use-context k8s-c1-H

Create a single Pod of image httpd:2.4.41-alpine in Namespace default. The Pod should be named pod1 and the container should be named pod1-container. This Pod should only be scheduled on controlplane nodes. Do not add new labels to any nodes.

First we find the controlplane node(s) and their taints:

k get node # find controlplane node
k describe node cluster1-controlplane1 | grep Taint -A1 # get controlplane node taints
k get node cluster1-controlplane1 --show-labels # get controlplane node labels

Next we create the Pod template:

k run pod1 --image=httpd:2.4.41-alpine $do > 2.yaml # $do is assumed to hold "--dry-run=client -o yaml"
vim 2.yaml

Perform the necessary changes manually. Use the Kubernetes docs and search for example for tolerations and nodeSelector to find examples:

# 2.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: pod1
  name: pod1
spec:
  containers:
  - image: httpd:2.4.41-alpine
    name: pod1-container                       # change
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
  tolerations:                                 # add
  - effect: NoSchedule                         # add
    key: node-role.kubernetes.io/control-plane # add
  nodeSelector:                                # add
    node-role.kubernetes.io/control-plane: ""  # add
status: {}

It is important to add the toleration for running on controlplane nodes, and also the nodeSelector to make sure the Pod only runs on controlplane nodes. If we only specify a toleration, the Pod can be scheduled on controlplane or worker nodes.

Now we create it:

k -f 2.yaml create

Let's check if the pod is scheduled:

➜ k get pod pod1 -o wide
NAME   READY   STATUS    RESTARTS   ...    NODE                     NOMINATED NODE
pod1   1/1     Running   0          ...    cluster1-controlplane1   <none>
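
If we want to see exactly which scheduling fields ended up in the Pod, we can read them back with jsonpath. This is only a sketch: the helper name show_placement is made up for illustration, and the formatting of the printed map and list depends on the kubectl version.

```shell
# Hypothetical helper (not part of the task): print the node a Pod was bound to,
# followed by its nodeSelector and tolerations as stored in the API.
show_placement() {
  kubectl get pod "$1" \
    -o jsonpath='{.spec.nodeName}{"\n"}{.spec.nodeSelector}{"\n"}{.spec.tolerations}{"\n"}'
}

# Usage, assuming the Pod from this task exists in the current namespace:
#   show_placement pod1
```

Besides our own toleration the list will usually contain a few tolerations that Kubernetes adds by default, so it is normal to see more than one entry.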

31. Scale down StatefulSet

Use context: kubectl config use-context k8s-c1-H

There are two Pods named o3db-* in Namespace project-c13. C13 management asked you to scale the Pods down to one replica to save resources.

If we check the Pods we see two replicas:

➜ k -n project-c13 get pod | grep o3db
o3db-0                                  1/1     Running   0          52s
o3db-1                                  1/1     Running   0          42s

From their name it looks like these are managed by a StatefulSet. But if we're not sure we could also check for the most common resources which manage Pods:

➜ k -n project-c13 get deploy,ds,sts | grep o3db
statefulset.apps/o3db   2/2     2m56s

Confirmed, we have to work with a StatefulSet. To find this out we could also look at the Pod labels:

➜ k -n project-c13 get pod --show-labels | grep o3db
o3db-0                                  1/1     Running   0          3m29s   app=nginx,controller-revision-hash=o3db-5fbd4bb9cc,statefulset.kubernetes.io/pod-name=o3db-0
o3db-1                                  1/1     Running   0          3m19s   app=nginx,controller-revision-hash=o3db-5fbd4bb9cc,statefulset.kubernetes.io/pod-name=o3db-1

To fulfil the task we simply run:

➜ k -n project-c13 scale sts o3db --replicas 1
statefulset.apps/o3db scaled

➜ k -n project-c13 get sts o3db
NAME   READY   AGE
o3db   1/1     4m39s

C13 Management is happy again.

32. Pod Ready if Service is reachable

Use context: kubectl config use-context k8s-c1-H

Do the following in Namespace default. Create a single Pod named ready-if-service-ready of image nginx:1.16.1-alpine. Configure a LivenessProbe which simply executes command true. Also configure a ReadinessProbe which checks whether the URL http://service-am-i-ready:80 is reachable; you can use wget -T2 -O- http://service-am-i-ready:80 for this. Start the Pod and confirm it isn't ready because of the ReadinessProbe.

Create a second Pod named am-i-ready of image nginx:1.16.1-alpine with label id: cross-server-ready. The already existing Service service-am-i-ready should now have that second Pod as endpoint.

Now the first Pod should be in ready state, confirm that.

It's a bit of an anti-pattern for one Pod to check another Pod for being ready using probes, hence the normally available readinessProbe.httpGet doesn't work for absolute remote urls. Still the workaround requested in this task should show how probes and Pod ⬌ Service communication works.

First we create the first Pod:

k run ready-if-service-ready --image=nginx:1.16.1-alpine $do > 4_pod1.yaml
vim 4_pod1.yaml

Next perform the necessary additions manually:

# 4_pod1.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: ready-if-service-ready
  name: ready-if-service-ready
spec:
  containers:
  - image: nginx:1.16.1-alpine
    name: ready-if-service-ready
    resources: {}
    livenessProbe:                                      # add from here
      exec:
        command:
        - 'true'
    readinessProbe:
      exec:
        command:
        - sh
        - -c
        - 'wget -T2 -O- http://service-am-i-ready:80'   # to here
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}

Then create the Pod:

k -f 4_pod1.yaml create

And confirm it's in a non-ready state:

➜ k get pod ready-if-service-ready
NAME                     READY   STATUS    RESTARTS   AGE
ready-if-service-ready   0/1     Running   0          7s

We can also check the reason for this using describe:

➜ k describe pod ready-if-service-ready
 ...
  Warning  Unhealthy  18s   kubelet, cluster1-node1  Readiness probe failed: Connecting to service-am-i-ready:80 (10.109.194.234:80)
wget: download timed out

Now we create the second Pod:

k run am-i-ready --image=nginx:1.16.1-alpine --labels="id=cross-server-ready"

The already existing Service service-am-i-ready should now have an Endpoint:

k describe svc service-am-i-ready
k get ep # also possible

Which will result in our first Pod being ready, just give it a minute for the Readiness probe to check again:

➜ k get pod ready-if-service-ready
NAME                     READY   STATUS    RESTARTS   AGE
ready-if-service-ready   1/1     Running   0          53s

Look at these Pods coworking together!
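
Instead of re-running k get pod until the READY column flips, we could let kubectl block until the Ready condition is reported. A minimal sketch, where the helper name wait_pod_ready and the default timeout of 120s are assumptions chosen for illustration:

```shell
# Hypothetical helper: block until a Pod reports the Ready condition
# or the timeout (second argument, default 120s) expires.
wait_pod_ready() {
  kubectl wait --for=condition=Ready "pod/$1" --timeout="${2:-120s}"
}

# Usage, assuming the Pod from this task:
#   wait_pod_ready ready-if-service-ready
```

If the Service has no endpoint yet, the command simply runs into the timeout, which matches the non-ready state we saw before creating the second Pod.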

33. Kubectl sorting

Use context: kubectl config use-context k8s-c1-H

There are various Pods in all namespaces. Write a command into /opt/course/5/find_pods.sh which lists all Pods sorted by their AGE (metadata.creationTimestamp).

Write a second command into /opt/course/5/find_pods_uid.sh which lists all Pods sorted by field metadata.uid. Use kubectl sorting for both commands.

A good resource here (and for many other things) is the kubectl-cheat-sheet. You can reach it fast when searching for "cheat sheet" in the Kubernetes docs.

# /opt/course/5/find_pods.sh

kubectl get pod -A --sort-by=.metadata.creationTimestamp

And to execute:

➜ sh /opt/course/5/find_pods.sh
NAMESPACE         NAME                                             ...          AGE
kube-system       kube-scheduler-cluster1-controlplane1            ...          63m
kube-system       etcd-cluster1-controlplane1                      ...          63m
kube-system       kube-apiserver-cluster1-controlplane1            ...          63m
kube-system       kube-controller-manager-cluster1-controlplane1   ...          63m
...

For the second command:

# /opt/course/5/find_pods_uid.sh
kubectl get pod -A --sort-by=.metadata.uid

And to execute:

➜ sh /opt/course/5/find_pods_uid.sh
NAMESPACE         NAME                                      ...          AGE
kube-system       coredns-5644d7b6d9-vwm7g                  ...          68m
project-c13       c13-3cc-runner-heavy-5486d76dd4-ddvlt     ...          63m
project-hamster   web-hamster-shop-849966f479-278vp         ...          63m
project-c13       c13-3cc-web-646b6c8756-qsg4b              ...          63m
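
The default table doesn't show the uid, so it's hard to see whether the sorting really worked. One way to check is to print the sort field as its own column. This is a sketch for verification only (the requested script files should keep the plain commands); the helper name list_pods_by and the column name SORTED_BY are made up:

```shell
# Hypothetical helper: list all Pods sorted by the given field and
# show that field in an extra column so the order can be checked by eye.
list_pods_by() {
  kubectl get pod -A --sort-by="$1" \
    -o custom-columns="NAMESPACE:.metadata.namespace,NAME:.metadata.name,SORTED_BY:$1"
}

# Usage:
#   list_pods_by .metadata.uid
#   list_pods_by .metadata.creationTimestamp
```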

34. Storage, PV, PVC, Pod volume

Use context: kubectl config use-context k8s-c1-H

Create a new PersistentVolume named safari-pv. It should have a capacity of 2Gi, accessMode ReadWriteOnce, hostPath /Volumes/Data and no storageClassName defined.

Next create a new PersistentVolumeClaim in Namespace project-tiger named safari-pvc. It should request 2Gi storage, accessMode ReadWriteOnce and should not define a storageClassName. The PVC should be bound to the PV correctly.

Finally create a new Deployment safari in Namespace project-tiger which mounts that volume at /tmp/safari-data. The Pods of that Deployment should be of image httpd:2.4.41-alpine.

vim 6_pv.yaml

Find an example from https://kubernetes.io/docs and alter it:

# 6_pv.yaml
kind: PersistentVolume
apiVersion: v1
metadata:
 name: safari-pv
spec:
 capacity:
  storage: 2Gi
 accessModes:
  - ReadWriteOnce
 hostPath:
  path: "/Volumes/Data"

Then create it:

k -f 6_pv.yaml create

Next the PersistentVolumeClaim:

vim 6_pvc.yaml

Find an example from https://kubernetes.io/docs and alter it:

# 6_pvc.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: safari-pvc
  namespace: project-tiger
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
     storage: 2Gi

Then create:

k -f 6_pvc.yaml create

And check that both have the status Bound:

➜ k -n project-tiger get pv,pvc
NAME                         CAPACITY  ... STATUS   CLAIM                    ...
persistentvolume/safari-pv   2Gi       ... Bound    project-tiger/safari-pvc ...

NAME                               STATUS   VOLUME      CAPACITY ...
persistentvolumeclaim/safari-pvc   Bound    safari-pv   2Gi      ...
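
The binding can also be read directly from the PV object, which records the claim it is bound to in spec.claimRef. A small sketch, with pv_claim being a made-up helper name:

```shell
# Hypothetical helper: print "<namespace>/<claim name>" of the PVC a PV is bound to.
# Prints just "/" if the PV has no claimRef yet.
pv_claim() {
  kubectl get pv "$1" \
    -o jsonpath='{.spec.claimRef.namespace}/{.spec.claimRef.name}{"\n"}'
}

# Usage, assuming the PV from this task:
#   pv_claim safari-pv
```

For a correctly bound volume this should name the same claim that the CLAIM column above shows.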

Next we create a Deployment and mount that volume:

k -n project-tiger create deploy safari \
  --image=httpd:2.4.41-alpine $do > 6_dep.yaml

vim 6_dep.yaml

Alter the yaml to mount the volume:

# 6_dep.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    app: safari
  name: safari
  namespace: project-tiger
spec:
  replicas: 1
  selector:
    matchLabels:
      app: safari
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: safari
    spec:
      volumes:                                      # add
      - name: data                                  # add
        persistentVolumeClaim:                      # add
          claimName: safari-pvc                     # add
      containers:
      - image: httpd:2.4.41-alpine
        name: container
        volumeMounts:                               # add
        - name: data                                # add
          mountPath: /tmp/safari-data               # add

k -f 6_dep.yaml create

We can confirm it's mounting correctly:

➜ k -n project-tiger describe pod safari-5cbf46d6d-mjhsb  | grep -A2 Mounts:   
    Mounts:
      /tmp/safari-data from data (rw) # there it is
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-n2sjj (ro)

35. Node and Pod Resource Usage

Use context: kubectl config use-context k8s-c1-H

The metrics-server has been installed in the cluster. Your colleague would like to know the kubectl commands to:

show Nodes resource usage
show Pods and their containers resource usage

Please write the commands into /opt/course/7/node.sh and /opt/course/7/pod.sh.

The command we need to use here is top:

➜ k top -h
Display Resource (CPU/Memory/Storage) usage.

 The top command allows you to see the resource consumption for nodes or pods.

 This command requires Metrics Server to be correctly configured and working on the server.

Available Commands:
  node        Display Resource (CPU/Memory/Storage) usage of nodes
  pod         Display Resource (CPU/Memory/Storage) usage of pods

We see that the metrics server provides information about resource usage:

➜ k top node
NAME                     CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
cluster1-controlplane1   178m         8%     1091Mi          57%
cluster1-node1           66m          6%     834Mi           44%
cluster1-node2           91m          9%     791Mi           41%

We create the first file:

# /opt/course/7/node.sh
kubectl top node

For the second file we might need to check the docs again:

➜ k top pod -h
Display Resource (CPU/Memory/Storage) usage of pods.
...
Namespace in current context is ignored even if specified with --namespace.
      --containers=false: If present, print usage of containers within a pod.
      --no-headers=false: If present, print output without headers.
...

With this we can finish this task:

# /opt/course/7/pod.sh
kubectl top pod --containers=true

36. Get Controlplane Information

Ssh into the controlplane node with ssh cluster1-controlplane1. Check how the controlplane components kubelet, kube-apiserver, kube-scheduler, kube-controller-manager and etcd are started/installed on the controlplane node. Also find out the name of the DNS application and how it's started/installed on the controlplane node.

Write your findings into file /opt/course/8/controlplane-components.txt. The file should be structured like:

# /opt/course/8/controlplane-components.txt
kubelet: [TYPE]
kube-apiserver: [TYPE]
kube-scheduler: [TYPE]
kube-controller-manager: [TYPE]
etcd: [TYPE]
dns: [TYPE] [NAME]

Choices of [TYPE] are: not-installed, process, static-pod, pod

We could start by finding processes of the requested components, especially the kubelet at first:

➜ ssh cluster1-controlplane1

root@cluster1-controlplane1:~# ps aux | grep kubelet # shows kubelet process

We can see which components are controlled via systemd looking at /etc/systemd/system directory:

➜ root@cluster1-controlplane1:~# find /etc/systemd/system/ | grep kube
/etc/systemd/system/kubelet.service.d
/etc/systemd/system/kubelet.service.d/10-kubeadm.conf
/etc/systemd/system/multi-user.target.wants/kubelet.service

➜ root@cluster1-controlplane1:~# find /etc/systemd/system/ | grep etcd

This shows kubelet is controlled via systemd, but there is no other service named kube or etcd. It seems that this cluster has been set up using kubeadm, so we check the default manifests directory:

➜ root@cluster1-controlplane1:~# find /etc/kubernetes/manifests/
/etc/kubernetes/manifests/
/etc/kubernetes/manifests/kube-controller-manager.yaml
/etc/kubernetes/manifests/etcd.yaml
/etc/kubernetes/manifests/kube-apiserver.yaml
/etc/kubernetes/manifests/kube-scheduler.yaml

(The kubelet could also have a different manifests directory specified via the parameter --pod-manifest-path in its systemd startup config.)
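
On kubeadm clusters the manifests directory is usually set through the staticPodPath field of the kubelet configuration file rather than through a command-line flag. A minimal sketch for reading it, assuming the usual kubeadm location /var/lib/kubelet/config.yaml (an assumption, not confirmed for this cluster) and an unquoted value; the helper name static_pod_path is made up:

```shell
# Hypothetical helper: print the staticPodPath value from a kubelet config file.
# The default path is the usual kubeadm location and is an assumption here.
static_pod_path() {
  awk '$1 == "staticPodPath:" { print $2 }' "${1:-/var/lib/kubelet/config.yaml}"
}

# Usage on the controlplane node:
#   static_pod_path
#   static_pod_path /path/to/other/kubelet-config.yaml
```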

This means the main 4 controlplane services are set up as static Pods. Actually, let's check all Pods running in the kube-system Namespace on the controlplane node:

➜ root@cluster1-controlplane1:~# kubectl -n kube-system get pod -o wide | grep controlplane1
coredns-5644d7b6d9-c4f68                            1/1     Running            ...   cluster1-controlplane1
coredns-5644d7b6d9-t84sc                            1/1     Running            ...   cluster1-controlplane1
etcd-cluster1-controlplane1                         1/1     Running            ...   cluster1-controlplane1
kube-apiserver-cluster1-controlplane1               1/1     Running            ...   cluster1-controlplane1
kube-controller-manager-cluster1-controlplane1      1/1     Running            ...   cluster1-controlplane1
kube-proxy-q955p                                    1/1     Running            ...   cluster1-controlplane1
kube-scheduler-cluster1-controlplane1               1/1     Running            ...   cluster1-controlplane1
weave-net-mwj47                                     2/2     Running            ...   cluster1-controlplane1

There we see the 4 static Pods, with -cluster1-controlplane1 as suffix.

We also see that the dns application seems to be coredns, but how is it controlled?

➜ root@cluster1-controlplane1$ kubectl -n kube-system get ds
NAME         DESIRED   CURRENT   ...   NODE SELECTOR            AGE
kube-proxy   3         3         ...   kubernetes.io/os=linux   155m
weave-net    3         3         ...   <none>                   155m

➜ root@cluster1-controlplane1$ kubectl -n kube-system get deploy
NAME      READY   UP-TO-DATE   AVAILABLE   AGE
coredns   2/2     2            2           155m

Seems like coredns is controlled via a Deployment. We combine our findings in the requested file:

# /opt/course/8/controlplane-components.txt
kubelet: process
kube-apiserver: static-pod
kube-scheduler: static-pod
kube-controller-manager: static-pod
etcd: static-pod
dns: pod coredns
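
To double-check the static-pod versus pod classification, we can look at who owns each Pod. On recent Kubernetes versions the mirror Pods of static Pods typically show Node as owner, while Pods of a Deployment show ReplicaSet and Pods of a DaemonSet show DaemonSet; treat the exact output as version-dependent. The helper name pod_owner_kinds is made up for this sketch:

```shell
# Hypothetical helper: list each Pod in a namespace (default kube-system)
# together with the kind of its first owner reference.
pod_owner_kinds() {
  kubectl -n "${1:-kube-system}" get pod \
    -o custom-columns='NAME:.metadata.name,OWNER:.metadata.ownerReferences[0].kind'
}

# Usage:
#   pod_owner_kinds
```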

You should be comfortable investigating a running cluster, know different methods of how a cluster and its services can be set up, and be able to troubleshoot and find error sources.

37. Kill Scheduler, Manual Scheduling

Use context: kubectl config use-context k8s-c2-AC

Ssh into the controlplane node with ssh cluster2-controlplane1. Temporarily stop the kube-scheduler, this means in a way that you can start it again afterwards.

Create a single Pod named manual-schedule of image httpd:2.4-alpine, confirm it's created but not scheduled on any node.

Now you're the scheduler and have all its power, manually schedule that Pod on node cluster2-controlplane1. Make sure it's running.

Start the kube-scheduler again and confirm it's running correctly by creating a second Pod named manual-schedule2 of image httpd:2.4-alpine and check if it's running on cluster2-node1.

Stop the Scheduler

First we find the controlplane node:

➜ k get node
NAME                     STATUS   ROLES           AGE   VERSION
cluster2-controlplane1   Ready    control-plane   26h   v1.28.2
cluster2-node1           Ready    <none>          26h   v1.28

Then we connect and check if the scheduler is running:

➜ ssh cluster2-controlplane1

➜ root@cluster2-controlplane1:~# kubectl -n kube-system get pod | grep schedule
kube-scheduler-cluster2-controlplane1            1/1     Running   0          

Kill the Scheduler (temporarily):

➜ root@cluster2-controlplane1:~# cd /etc/kubernetes/manifests/

➜ root@cluster2-controlplane1:~# mv kube-scheduler.yaml ..

And it should be stopped:

➜ root@cluster2-controlplane1:~# kubectl -n kube-system get pod | grep schedule

➜ root@cluster2-controlplane1:~#

Create a Pod

Now we create the Pod:

k run manual-schedule --image=httpd:2.4-alpine

And confirm it has no node assigned:

➜ k get pod manual-schedule -o wide
NAME              READY   STATUS    ...   NODE     NOMINATED NODE
manual-schedule   0/1     Pending   ...   <none>   <none>

Manually schedule the Pod

Let's play the scheduler now:

k get pod manual-schedule -o yaml > 9.yaml

# 9.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2020-09-04T15:51:02Z"
  labels:
    run: manual-schedule
  managedFields:
...
    manager: kubectl-run
    operation: Update
    time: "2020-09-04T15:51:02Z"
  name: manual-schedule
  namespace: default
  resourceVersion: "3515"
  selfLink: /api/v1/namespaces/default/pods/manual-schedule
  uid: 8e9d2532-4779-4e63-b5af-feb82c74a935
spec:
  nodeName: cluster2-controlplane1        # add the controlplane node name
  containers:
  - image: httpd:2.4-alpine
    imagePullPolicy: IfNotPresent
    name: manual-schedule
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-nxnc7
      readOnly: true
  dnsPolicy: ClusterFirst
...

The only thing a scheduler does is set the nodeName for a Pod declaration. How it finds the correct node to schedule on is a complicated matter that takes many variables into account.

As we cannot kubectl apply or kubectl edit in this case, we need to delete and create, or replace:

k -f 9.yaml replace --force

How does it look?

➜ k get pod manual-schedule -o wide
NAME              READY   STATUS    ...   NODE            
manual-schedule   1/1     Running   ...   cluster2-controlplane1

It looks like our Pod is running on the controlplane now as requested, although no tolerations were specified. Only the scheduler takes taints/tolerations/affinity into account when finding the correct node name. That's why it's still possible to assign Pods manually directly to a controlplane node and skip the scheduler.

Start the scheduler again

➜ ssh cluster2-controlplane1

➜ root@cluster2-controlplane1:~# cd /etc/kubernetes/manifests/

➜ root@cluster2-controlplane1:~# mv ../kube-scheduler.yaml .

Check that it's running:

➜ root@cluster2-controlplane1:~# kubectl -n kube-system get pod | grep schedule
kube-scheduler-cluster2-controlplane1            1/1     Running   0          16s

Schedule a second test Pod:

k run manual-schedule2 --image=httpd:2.4-alpine

➜ k get pod -o wide | grep schedule
manual-schedule    1/1     Running   ...   cluster2-controlplane1
manual-schedule2   1/1     Running   ...   cluster2-node1

Back to normal.

38. RBAC ServiceAccount Role RoleBinding

Use context: kubectl config use-context k8s-c1-H

Create a new ServiceAccount processor in Namespace project-hamster. Create a Role and RoleBinding, both named processor as well. These should allow the new SA to only create Secrets and ConfigMaps in that Namespace.

Let's talk a little about RBAC resources

A ClusterRole|Role defines a set of permissions and where it is available, in the whole cluster or just a single Namespace.

A ClusterRoleBinding|RoleBinding connects a set of permissions with an account and defines where it is applied, in the whole cluster or just a single Namespace.

Because of this there are 4 different RBAC combinations, of which 3 are valid:

Role + RoleBinding (available in single Namespace, applied in single Namespace)
ClusterRole + ClusterRoleBinding (available cluster-wide, applied cluster-wide)
ClusterRole + RoleBinding (available cluster-wide, applied in single Namespace)
Role + ClusterRoleBinding (NOT POSSIBLE: available in single Namespace, applied cluster-wide)
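
The third combination is the least obvious one, so here is a sketch of how it could look with kubectl. It only prints the manifests (client-side dry run) and creates nothing. The helper name and the object names demo-reader and demo-binding are made up for illustration and are not part of this task:

```shell
# Hypothetical helper: print (without creating) a ClusterRole and a RoleBinding
# that applies it inside a single Namespace.
# "demo-reader" and "demo-binding" are made-up names.
render_clusterrole_rolebinding() {
  ns="$1"; sa="$2"
  kubectl create clusterrole demo-reader --verb=get,list --resource=configmaps \
    --dry-run=client -o yaml
  echo '---'
  kubectl -n "$ns" create rolebinding demo-binding \
    --clusterrole=demo-reader --serviceaccount="$ns:$sa" \
    --dry-run=client -o yaml
}

# Usage:
#   render_clusterrole_rolebinding project-hamster processor
```

The ClusterRole is defined once for the whole cluster, but because it's attached with a RoleBinding the permissions only apply inside the Namespace of that binding.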

To the solution

We first create the ServiceAccount:

➜ k -n project-hamster create sa processor
serviceaccount/processor created

Then for the Role:

k -n project-hamster create role -h # examples

So we execute:

k -n project-hamster create role processor \
  --verb=create \
  --resource=secret \
  --resource=configmap

Which will create a Role like:

# kubectl -n project-hamster create role processor --verb=create --resource=secret --resource=configmap
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: processor
  namespace: project-hamster
rules:
- apiGroups:
  - ""
  resources:
  - secrets
  - configmaps
  verbs:
  - create

Now we bind the Role to the ServiceAccount:

k -n project-hamster create rolebinding -h # examples

So we create it:

k -n project-hamster create rolebinding processor \
  --role processor \
  --serviceaccount project-hamster:processor

This will create a RoleBinding like:

# kubectl -n project-hamster create rolebinding processor --role processor --serviceaccount project-hamster:processor
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: processor
  namespace: project-hamster
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: processor
subjects:
- kind: ServiceAccount
  name: processor
  namespace: project-hamster

To test our RBAC setup we can use kubectl auth can-i:

k auth can-i -h # examples

Like this:

➜ k -n project-hamster auth can-i create secret \
  --as system:serviceaccount:project-hamster:processor
yes

➜ k -n project-hamster auth can-i create configmap \
  --as system:serviceaccount:project-hamster:processor
yes

➜ k -n project-hamster auth can-i create pod \
  --as system:serviceaccount:project-hamster:processor
no

➜ k -n project-hamster auth can-i delete secret \
  --as system:serviceaccount:project-hamster:processor
no

➜ k -n project-hamster auth can-i get configmap \
  --as system:serviceaccount:project-hamster:processor
no

Done.
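
The individual can-i checks above can also be run as a small matrix. This is a sketch, with the helper name check_sa_permissions and the chosen verbs and resources being assumptions for illustration:

```shell
# Hypothetical helper: print a verb/resource matrix of `kubectl auth can-i`
# answers for one ServiceAccount in one Namespace.
check_sa_permissions() {
  ns="$1"; sa="$2"
  for verb in create get delete; do
    for res in secret configmap pod; do
      # can-i exits non-zero when the answer is "no", so we keep only the
      # printed answer and ignore the exit status.
      answer="$(kubectl -n "$ns" auth can-i "$verb" "$res" \
        --as "system:serviceaccount:$ns:$sa" 2>/dev/null || true)"
      printf '%-7s %-10s %s\n' "$verb" "$res" "$answer"
    done
  done
}

# Usage:
#   check_sa_permissions project-hamster processor
```

For the Role from this task we would expect yes only for create on secret and configmap, and no everywhere else.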

39. DaemonSet on all Nodes

Use context: kubectl config use-context k8s-c1-H

Use Namespace project-tiger for the following. Create a DaemonSet named ds-important with image httpd:2.4-alpine and labels id=ds-important and uuid=18426a0b-5f59-4e10-923f-c0e078e82462. The Pods it creates should request 10 millicore cpu and 10 mebibyte memory. The Pods of that DaemonSet should run on all nodes, also controlplanes.

As of now we aren't able to create a DaemonSet directly using kubectl, so we create a Deployment and just change it up:

k -n project-tiger create deployment --image=httpd:2.4-alpine ds-important $do > 11.yaml

vim 11.yaml

(Sure you could also search for a DaemonSet example yaml in the Kubernetes docs and alter it.)

Then we adjust the yaml to:

# 11.yaml
apiVersion: apps/v1
kind: DaemonSet                           # change from Deployment to Daemonset
metadata:
  creationTimestamp: null
  labels:                                           # add
    id: ds-important                                # add
    uuid: 18426a0b-5f59-4e10-923f-c0e078e82462      # add
  name: ds-important
  namespace: project-tiger                          # important
spec:
  #replicas: 1                                      # remove
  selector:
    matchLabels:
      id: ds-important                              # add
      uuid: 18426a0b-5f59-4e10-923f-c0e078e82462    # add
  #strategy: {}                                     # remove
  template:
    metadata:
      creationTimestamp: null
      labels:
        id: ds-important                            # add
        uuid: 18426a0b-5f59-4e10-923f-c0e078e82462  # add
    spec:
      containers:
      - image: httpd:2.4-alpine
        name: ds-important
        resources:
          requests:                                 # add
            cpu: 10m                                # add
            memory: 10Mi                            # add
      tolerations:                                  # add
      - effect: NoSchedule                          # add
        key: node-role.kubernetes.io/control-plane  # add
#status: {}                                         # remove

It was requested that the DaemonSet runs on all nodes, so we need to specify the toleration for this.

Let's confirm:

k -f 11.yaml create

➜ k -n project-tiger get ds
NAME           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
ds-important   3         3         3       3            3           <none>          8s

➜ k -n project-tiger get pod -l id=ds-important -o wide
NAME                      READY   STATUS          NODE
ds-important-6pvgm        1/1     Running   ...   cluster1-node1
ds-important-lh5ts        1/1     Running   ...   cluster1-controlplane1
ds-important-qhjcq        1/1     Running   ...   cluster1-node2
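
A quick way to check "runs on all nodes" without comparing lists by eye is to compare the DaemonSet's desired Pod count with the number of nodes. A sketch, with ds_node_coverage as a made-up helper name:

```shell
# Hypothetical helper: compare the DaemonSet's desired Pod count with the
# number of nodes; returns non-zero if they differ.
ds_node_coverage() {
  nodes="$(kubectl get node --no-headers | grep -c '')"
  desired="$(kubectl -n "$1" get ds "$2" \
    -o jsonpath='{.status.desiredNumberScheduled}')"
  echo "desired=$desired nodes=$nodes"
  [ "$desired" -eq "$nodes" ]
}

# Usage:
#   ds_node_coverage project-tiger ds-important
```

If the controlplane toleration were missing, the desired count would be expected to come out lower than the node count.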

40. Deployment on all Nodes

Use context: kubectl config use-context k8s-c1-H

Use Namespace project-tiger for the following. Create a Deployment named deploy-important with label id=very-important (the Pods should also have this label) and 3 replicas. It should contain two containers, the first named container1 with image nginx:1.17.6-alpine and the second one named container2 with image google/pause.

There should only ever be one Pod of that Deployment running on any one worker node. We have two worker nodes: cluster1-node1 and cluster1-node2. Because the Deployment has three replicas the result should be that on both nodes one Pod is running. The third Pod won't be scheduled unless a new worker node is added.

In a way we kind of simulate the behaviour of a DaemonSet here, but using a Deployment and a fixed number of replicas.

There are two possible ways, one using podAntiAffinity and one using topologySpreadConstraint.

PodAntiAffinity

The idea here is that we create an "Inter-pod anti-affinity" which allows us to say a Pod should only be scheduled on a node where another Pod of a specific label (here the same label) is not already running.

Let's begin by creating the Deployment template:


k -n project-tiger create deployment \
  --image=nginx:1.17.6-alpine deploy-important $do > 12.yaml

vim 12.yaml

Then change the yaml to:

# 12.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    id: very-important                  # change
  name: deploy-important
  namespace: project-tiger              # important
spec:
  replicas: 3                           # change
  selector:
    matchLabels:
      id: very-important                # change
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        id: very-important              # change
    spec:
      containers:
      - image: nginx:1.17.6-alpine
        name: container1                # change
        resources: {}
      - image: google/pause             # add
        name: container2                # add
      affinity:                                             # add
        podAntiAffinity:                                    # add
          requiredDuringSchedulingIgnoredDuringExecution:   # add
          - labelSelector:                                  # add
              matchExpressions:                             # add
              - key: id                                     # add
                operator: In                                # add
                values:                                     # add
                - very-important                            # add
            topologyKey: kubernetes.io/hostname             # add
status: {}

Specify a topologyKey, which is a pre-populated Kubernetes label; you can find it by describing a node.
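
To see the values of such a label across all nodes at once, kubectl can print it as an extra column. A tiny sketch, with node_label_values as a made-up helper name:

```shell
# Hypothetical helper: list all nodes with the value of one label as an extra column.
node_label_values() {
  kubectl get node -L "$1"
}

# Usage:
#   node_label_values kubernetes.io/hostname
```

Each node normally has its own value for kubernetes.io/hostname, which is why this key works for "at most one Pod per node" rules.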

TopologySpreadConstraints

We can achieve the same with topologySpreadConstraints. Best to try out and play with both.

# 12.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    id: very-important                  # change
  name: deploy-important
  namespace: project-tiger              # important
spec:
  replicas: 3                           # change
  selector:
    matchLabels:
      id: very-important                # change
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        id: very-important              # change
    spec:
      containers:
      - image: nginx:1.17.6-alpine
        name: container1                # change
        resources: {}
      - image: google/pause             # add
        name: container2                # add
      topologySpreadConstraints:                 # add
      - maxSkew: 1                               # add
        topologyKey: kubernetes.io/hostname      # add
        whenUnsatisfiable: DoNotSchedule         # add
        labelSelector:                           # add
          matchLabels:                           # add
            id: very-important                   # add
status: {}

Apply and Run

Let's run it:

k -f 12.yaml create

Then we check the Deployment status where it shows 2/3 ready count:

➜ k -n project-tiger get deploy -l id=very-important
NAME               READY   UP-TO-DATE   AVAILABLE   AGE
deploy-important   2/3     3            2           2m35s

And running the following we see one Pod on each worker node and one not scheduled.

➜ k -n project-tiger get pod -o wide -l id=very-important
NAME                                READY   STATUS    ...   NODE             
deploy-important-58db9db6fc-9ljpw   2/2     Running   ...   cluster1-node1
deploy-important-58db9db6fc-lnxdb   0/2     Pending   ...   <none>          
deploy-important-58db9db6fc-p2rz8   2/2     Running   ...   cluster1-node2

If we kubectl describe the Pod deploy-important-58db9db6fc-lnxdb, it will show us that the reason for not scheduling is our podAntiAffinity rule:

Warning  FailedScheduling  63s (x3 over 65s)  default-scheduler  0/3 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/control-plane: }, that the pod didn't tolerate, 2 node(s) didn't match pod affinity/anti-affinity, 2 node(s) didn't satisfy existing pods anti-affinity rules

Or our topologySpreadConstraints:

Warning  FailedScheduling  16s   default-scheduler  0/3 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/control-plane: }, that the pod didn't tolerate, 2 node(s) didn't match pod topology spread constraints.
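
A compact way to confirm the one-Pod-per-node result is to count the Pods by node name. A sketch, with pods_per_node as a made-up helper name:

```shell
# Hypothetical helper: count Pods matching a label selector per node.
# Unscheduled Pods have no nodeName and are printed by kubectl as <none>.
pods_per_node() {
  kubectl -n "$1" get pod -l "$2" --no-headers \
    -o custom-columns='NODE:.spec.nodeName' | sort | uniq -c
}

# Usage:
#   pods_per_node project-tiger id=very-important
```

With either solution every worker node should show up with a count of 1, plus one <none> line for the Pod that can't be scheduled.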

41. Multi Containers and Pod shared Volume

Use context: kubectl config use-context k8s-c1-H

Create a Pod named multi-container-playground in Namespace default with three containers, named c1, c2 and c3. There should be a volume attached to that Pod and mounted into every container, but the volume shouldn't be persisted or shared with other Pods.

Container c1 should be of image nginx:1.17.6-alpine and have the name of the node where its Pod is running available as environment variable MY_NODE_NAME.

Container c2 should be of image busybox:1.31.1 and write the output of the date command every second in the shared volume into file date.log. You can use while true; do date >> /your/vol/path/date.log; sleep 1; done for this.

Container c3 should be of image busybox:1.31.1 and constantly send the content of file date.log from the shared volume to stdout. You can use tail -f /your/vol/path/date.log for this.

Check the logs of container c3 to confirm correct setup.

First we create the Pod template:

k run multi-container-playground --image=nginx:1.17.6-alpine $do > 13.yaml

vim 13.yaml

And add the other containers and the commands they should execute:

# 13.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: multi-container-playground
  name: multi-container-playground
spec:
  containers:
  - image: nginx:1.17.6-alpine
    name: c1                                                          # change
    resources: {}
    env:                                                              # add
    - name: MY_NODE_NAME                                              # add
      valueFrom:                                                      # add
        fieldRef:                                                     # add
          fieldPath: spec.nodeName                                    # add
    volumeMounts:                                                     # add
    - name: vol                                                       # add
      mountPath: /vol                                                 # add
  - image: busybox:1.31.1                                             # add
    name: c2                                                          # add
    command: ["sh", "-c", "while true; do date >> /vol/date.log; sleep 1; done"]  # add
    volumeMounts:                                                     # add
    - name: vol                                                       # add
      mountPath: /vol                                                 # add
  - image: busybox:1.31.1                                             # add
    name: c3                                                          # add
    command: ["sh", "-c", "tail -f /vol/date.log"]                    # add
    volumeMounts:                                                     # add
    - name: vol                                                       # add
      mountPath: /vol                                                 # add
  dnsPolicy: ClusterFirst
  restartPolicy: Always
  volumes:                                                            # add
    - name: vol                                                       # add
      emptyDir: {}                                                    # add
status: {}

k -f 13.yaml create

Oh boy, lots of requested things. We check if everything is good with the Pod:

➜ k get pod multi-container-playground
NAME                         READY   STATUS    RESTARTS   AGE
multi-container-playground   3/3     Running   0          95s

Good, then we check if container c1 has the requested node name as env variable:

➜ k exec multi-container-playground -c c1 -- env | grep MY
MY_NODE_NAME=cluster1-node2

And finally we check the logging:

➜ k logs multi-container-playground -c c3
Sat Dec  7 16:05:10 UTC 2077
Sat Dec  7 16:05:11 UTC 2077
Sat Dec  7 16:05:12 UTC 2077
Sat Dec  7 16:05:13 UTC 2077
Sat Dec  7 16:05:14 UTC 2077
Sat Dec  7 16:05:15 UTC 2077
Sat Dec  7 16:05:16 UTC 2077

42. Find out Cluster Information

Use context: kubectl config use-context k8s-c1-H

You're asked to find out the following information about the cluster k8s-c1-H:

How many controlplane nodes are available?
How many worker nodes are available?
What is the Service CIDR?
Which Networking (or CNI Plugin) is configured and where is its config file?
Which suffix will static pods have that run on cluster1-node1?

Write your answers into file /opt/course/14/cluster-info, structured like this:

# /opt/course/14/cluster-info
[ANSWER]
[ANSWER]
[ANSWER]
[ANSWER]
[ANSWER]

How many controlplane and worker nodes are available?

➜ k get node
NAME                    STATUS   ROLES          AGE   VERSION
cluster1-controlplane1  Ready    control-plane  27h   v1.28.2
cluster1-node1          Ready    <none>         27h   v1.28.2
cluster1-node2          Ready    <none>         27h   v1.28.2

We see one controlplane and two workers.
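
The same count can be derived without reading the table by eye. This is only a minimal sketch: the function name count_node_roles is made up for illustration, and it assumes every node without the control-plane role is a worker. The sample rows are copied from the listing above.

```shell
# Count control-plane and worker nodes from `kubectl get node --no-headers` output.
# Assumption: any node whose ROLES column lacks "control-plane" is a worker.
count_node_roles() {
  awk '{ if ($3 ~ /control-plane/) cp++; else w++ }
       END { printf "controlplane=%d worker=%d\n", cp, w }'
}

# Sample rows copied from the listing above (header row removed).
count_node_roles <<'EOF'
cluster1-controlplane1  Ready    control-plane  27h   v1.28.2
cluster1-node1          Ready    <none>         27h   v1.28.2
cluster1-node2          Ready    <none>         27h   v1.28.2
EOF
# prints: controlplane=1 worker=2
```

On a live cluster the input would come from k get node --no-headers piped into the function.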

What is the Service CIDR?

➜ ssh cluster1-controlplane1
➜ root@cluster1-controlplane1:~# cat /etc/kubernetes/manifests/kube-apiserver.yaml | grep range    
- --service-cluster-ip-range=10.96.0.0/12
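
If only the value is needed for the answer file, the flag can be stripped off. A small sketch, assuming the manifest contains the --service-cluster-ip-range flag exactly as shown above; the helper name service_cidr is hypothetical and the demo runs on a scratch file rather than the real manifest.

```shell
# Print the value of --service-cluster-ip-range from a kube-apiserver manifest.
# $1 = path to the manifest file.
service_cidr() {
  sed -n 's/.*--service-cluster-ip-range=\([^[:space:]]*\).*/\1/p' "$1"
}

# Demo on a temporary file holding the single line shown above.
manifest=$(mktemp)
printf '%s\n' '    - --service-cluster-ip-range=10.96.0.0/12' > "$manifest"
service_cidr "$manifest"   # prints: 10.96.0.0/12
rm -f "$manifest"
```

On the controlplane node the argument would be /etc/kubernetes/manifests/kube-apiserver.yaml.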

Which Networking (or CNI Plugin) is configured and where is its config file?

➜ root@cluster1-controlplane1:~# find /etc/cni/net.d/
/etc/cni/net.d/
/etc/cni/net.d/10-weave.conflist

➜ root@cluster1-controlplane1:~# cat /etc/cni/net.d/10-weave.conflist
{
    "cniVersion": "0.3.0",
    "name": "weave",
...

By default the kubelet looks into /etc/cni/net.d to discover the CNI plugins. This will be the same on every controlplane and worker node.

Which suffix will static pods have that run on cluster1-node1?

The suffix is the node hostname with a leading hyphen. It used to be -static in earlier Kubernetes versions.

Result

The resulting /opt/course/14/cluster-info could look like:

# /opt/course/14/cluster-info

# How many controlplane nodes are available?
1: 1

# How many worker nodes are available?
2: 2

# What is the Service CIDR?
3: 10.96.0.0/12

# Which Networking (or CNI Plugin) is configured and where is its config file?
4: Weave, /etc/cni/net.d/10-weave.conflist

# Which suffix will static pods have that run on cluster1-node1?
5: -cluster1-node1

43. Cluster Event Logging

Use context: kubectl config use-context k8s-c2-AC

Write a command into /opt/course/15/cluster_events.sh which shows the latest events in the whole cluster, ordered by time (metadata.creationTimestamp). Use kubectl for it.

Now delete the kube-proxy Pod running on node cluster2-node1 and write the events this caused into /opt/course/15/pod_kill.log.

Finally kill the containerd container of the kube-proxy Pod on node cluster2-node1 and write the events into /opt/course/15/container_kill.log.

Do you notice differences in the events both actions caused?

# /opt/course/15/cluster_events.sh
kubectl get events -A --sort-by=.metadata.creationTimestamp

Now we delete the kube-proxy Pod:

k -n kube-system get pod -o wide | grep proxy # find pod running on cluster2-node1

k -n kube-system delete pod kube-proxy-z64cg

Now check the events:

sh /opt/course/15/cluster_events.sh

Write the events the killing caused into /opt/course/15/pod_kill.log:

# /opt/course/15/pod_kill.log
kube-system   9s          Normal    Killing           pod/kube-proxy-jsv7t   ...
kube-system   3s          Normal    SuccessfulCreate  daemonset/kube-proxy   ...
kube-system   <unknown>   Normal    Scheduled         pod/kube-proxy-m52sx   ...
default       2s          Normal    Starting          node/cluster2-node1  ...
kube-system   2s          Normal    Created           pod/kube-proxy-m52sx   ...
kube-system   2s          Normal    Pulled            pod/kube-proxy-m52sx   ...
kube-system   2s          Normal    Started           pod/kube-proxy-m52sx   ...

Finally we will try to provoke events by killing the containerd container of the kube-proxy Pod:

➜ ssh cluster2-node1

➜ root@cluster2-node1:~# crictl ps | grep kube-proxy
1e020b43c4423   36c4ebbc9d979   About an hour ago   Running   kube-proxy     ...

➜ root@cluster2-node1:~# crictl rm 1e020b43c4423
1e020b43c4423

➜ root@cluster2-node1:~# crictl ps | grep kube-proxy
0ae4245707910   36c4ebbc9d979   17 seconds ago      Running   kube-proxy     ...

We killed the main container (1e020b43c4423), but also noticed that a new container (0ae4245707910) was directly created. Thanks Kubernetes!

Now we see if this caused events again and we write those into the second file:

sh /opt/course/15/cluster_events.sh

# /opt/course/15/container_kill.log
kube-system   13s         Normal    Created      pod/kube-proxy-m52sx    ...
kube-system   13s         Normal    Pulled       pod/kube-proxy-m52sx    ...
kube-system   13s         Normal    Started      pod/kube-proxy-m52sx    ...

Comparing the events, we see that deleting the whole Pod required more work and therefore produced more events. For example, the DaemonSet had to step in and re-create the missing Pod. When we manually killed the main container, the Pod still existed and only its container had to be re-created, hence fewer events.
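
To make the comparison less of an eyeball exercise, the two listings can be summarized by their REASON column. A rough sketch, assuming the column layout of k get events -A shown above (namespace, age, type, reason, object); the function name count_event_reasons is hypothetical.

```shell
# Summarize an event listing by its REASON column (4th field).
# Comment lines and empty lines are skipped.
count_event_reasons() {
  awk '!/^#/ && NF { n[$4]++ } END { for (r in n) print n[r], r }' | sort -k2
}

# Sample rows copied from the container_kill.log listing above.
count_event_reasons <<'EOF'
kube-system   13s         Normal    Created      pod/kube-proxy-m52sx    ...
kube-system   13s         Normal    Pulled       pod/kube-proxy-m52sx    ...
kube-system   13s         Normal    Started      pod/kube-proxy-m52sx    ...
EOF
```

Running it over both log files shows which reasons (such as Killing or SuccessfulCreate) only appear when the whole Pod is deleted.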

44. Namespaces and Api Resources

Use context: kubectl config use-context k8s-c1-H

Write the names of all namespaced Kubernetes resources (like Pod, Secret, ConfigMap...) into /opt/course/16/resources.txt.

Find the project-* Namespace with the highest number of Roles defined in it and write its name and amount of Roles into /opt/course/16/crowded-namespace.txt.

Namespace and Namespaces Resources

Now we can get a list of all resources like:

k api-resources    # shows all

k api-resources -h # help always good

k api-resources --namespaced -o name > /opt/course/16/resources.txt

Which results in the file:

# /opt/course/16/resources.txt
bindings
configmaps
endpoints
events
limitranges
persistentvolumeclaims
pods
podtemplates
replicationcontrollers
resourcequotas
secrets
serviceaccounts
services
controllerrevisions.apps
daemonsets.apps
deployments.apps
replicasets.apps
statefulsets.apps
localsubjectaccessreviews.authorization.k8s.io
horizontalpodautoscalers.autoscaling
cronjobs.batch
jobs.batch
leases.coordination.k8s.io
events.events.k8s.io
ingresses.extensions
ingresses.networking.k8s.io
networkpolicies.networking.k8s.io
poddisruptionbudgets.policy
rolebindings.rbac.authorization.k8s.io
roles.rbac.authorization.k8s.io

Namespace with most Roles

➜ k -n project-c13 get role --no-headers | wc -l
No resources found in project-c13 namespace.
0

➜ k -n project-c14 get role --no-headers | wc -l
300

➜ k -n project-hamster get role --no-headers | wc -l
No resources found in project-hamster namespace.
0

➜ k -n project-snake get role --no-headers | wc -l
No resources found in project-snake namespace.
0

➜ k -n project-tiger get role --no-headers | wc -l
No resources found in project-tiger namespace.
0

Finally we write the name and amount into the file:

# /opt/course/16/crowded-namespace.txt
project-c14 with 300 resources
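
Checking each Namespace by hand works for five of them, but picking the maximum can also be scripted. A minimal sketch, assuming one "namespace count" pair per input line; the name most_crowded is hypothetical and the sample numbers are the ones counted above.

```shell
# Pick the line with the highest count from "namespace count" pairs on stdin.
most_crowded() {
  sort -k2,2nr | head -n 1
}

# Counts taken from the manual checks above.
most_crowded <<'EOF'
project-c13 0
project-c14 300
project-hamster 0
project-snake 0
project-tiger 0
EOF
# prints: project-c14 300

# On a live cluster the pairs could be produced like this (not run here):
#   for ns in $(kubectl get ns -o name | sed 's|^namespace/||' | grep '^project-'); do
#     echo "$ns $(kubectl -n "$ns" get role --no-headers 2>/dev/null | wc -l)"
#   done | most_crowded
```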

45. Find Container of Pod and check info

Use context: kubectl config use-context k8s-c1-H

In Namespace project-tiger create a Pod named tigers-reunite of image httpd:2.4.41-alpine with labels pod=container and container=pod. Find out on which node the Pod is scheduled. Ssh into that node and find the containerd container belonging to that Pod.

Using command crictl:

Write the ID of the container and the info.runtimeType into /opt/course/17/pod-container.txt
Write the logs of the container into /opt/course/17/pod-container.log

First we create the Pod:

k -n project-tiger run tigers-reunite \
  --image=httpd:2.4.41-alpine \
  --labels "pod=container,container=pod"

Next we find out the node it's scheduled on:

k -n project-tiger get pod -o wide

# or fancy:
k -n project-tiger get pod tigers-reunite -o jsonpath="{.spec.nodeName}"

Then we ssh into that node and check the container info:

➜ ssh cluster1-node2

➜ root@cluster1-node2:~# crictl ps | grep tigers-reunite
b01edbe6f89ed    54b0995a63052    5 seconds ago    Running        tigers-reunite ...

➜ root@cluster1-node2:~# crictl inspect b01edbe6f89ed | grep runtimeType
    "runtimeType": "io.containerd.runc.v2",

Then we fill the requested file (on the main terminal):

# /opt/course/17/pod-container.txt
b01edbe6f89ed io.containerd.runc.v2
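
Instead of copying the value by hand, the runtimeType can be cut out of the inspect output. A small sketch that assumes the field appears on one line in the form shown above; the helper name runtime_type is hypothetical.

```shell
# Extract the first runtimeType value from `crictl inspect <id>` output on stdin.
runtime_type() {
  sed -n 's/.*"runtimeType": *"\([^"]*\)".*/\1/p' | head -n 1
}

# Sample line copied from the inspect output above.
printf '%s\n' '    "runtimeType": "io.containerd.runc.v2",' | runtime_type
# prints: io.containerd.runc.v2
```

On the node this would be used as crictl inspect b01edbe6f89ed piped into runtime_type.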

Finally we write the container logs in the second file:

ssh cluster1-node2 'crictl logs b01edbe6f89ed' &> /opt/course/17/pod-container.log

The &> in the command above redirects both standard output and standard error.

You could also simply run crictl logs on the node and copy the content manually, if it's not a lot. The file should look like:

# /opt/course/17/pod-container.log
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 10.44.0.37. Set the 'ServerName' directive globally to suppress this message
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 10.44.0.37. Set the 'ServerName' directive globally to suppress this message
[Mon Sep 13 13:32:18.555280 2021] [mpm_event:notice] [pid 1:tid 139929534545224] AH00489: Apache/2.4.41 (Unix) configured -- resuming normal operations
[Mon Sep 13 13:32:18.555610 2021] [core:notice] [pid 1:tid 139929534545224] AH00094: Command line: 'httpd -D FOREGROUND'

46. Fix Kubelet

Use context: kubectl config use-context k8s-c3-CCC

There seems to be an issue with the kubelet not running on cluster3-node1. Fix it and confirm that the cluster has node cluster3-node1 available in Ready state afterwards. You should be able to schedule a Pod on cluster3-node1 afterwards.

Write the reason of the issue into /opt/course/18/reason.txt.

The procedure on tasks like these should be to check if the kubelet is running, if not start it, then check its logs and correct errors if there are some.

It's always helpful to check if other clusters already have some of the components defined and running, so you can copy and use existing config files, though in this case it might not be necessary.

Check node status:

➜ k get node
NAME                     STATUS     ROLES           AGE   VERSION
cluster3-controlplane1   Ready      control-plane   14d   v1.28.2
cluster3-node1           NotReady   <none>          14d   v1.28.2

First we check if the kubelet is running:

➜ ssh cluster3-node1

➜ root@cluster3-node1:~# ps aux | grep kubelet
root     29294  0.0  0.2  14856  1016 pts/0    S+   11:30   0:00 grep --color=auto kubelet

Nope, so we check if it's configured as a systemd service:

➜ root@cluster3-node1:~# service kubelet status
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: inactive (dead) since Sun 2019-12-08 11:30:06 UTC; 50min 52s ago
...

Yes, it's configured as a service with config at /etc/systemd/system/kubelet.service.d/10-kubeadm.conf, but we see it's inactive. Let's try to start it:

➜ root@cluster3-node1:~# service kubelet start

➜ root@cluster3-node1:~# service kubelet status
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: activating (auto-restart) (Result: exit-code) since Thu 2020-04-30 22:03:10 UTC; 3s ago
     Docs: https://kubernetes.io/docs/home/
  Process: 5989 ExecStart=/usr/local/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=203/EXEC)
 Main PID: 5989 (code=exited, status=203/EXEC)

Apr 30 22:03:10 cluster3-node1 systemd[5989]: kubelet.service: Failed at step EXEC spawning /usr/local/bin/kubelet: No such file or directory
Apr 30 22:03:10 cluster3-node1 systemd[1]: kubelet.service: Main process exited, code=exited, status=203/EXEC
Apr 30 22:03:10 cluster3-node1 systemd[1]: kubelet.service: Failed with result 'exit-code'.

We see it's trying to execute /usr/local/bin/kubelet with some parameters defined in its service config file. A good way to find errors and get more logs is to run the command manually (usually also with its parameters).

➜ root@cluster3-node1:~# /usr/local/bin/kubelet
-bash: /usr/local/bin/kubelet: No such file or directory

➜ root@cluster3-node1:~# whereis kubelet
kubelet: /usr/bin/kubelet

Another way would be to check the extended logs of the service, for example using journalctl -u kubelet.

Well, there we have it, wrong path specified. Correct the path in file /etc/systemd/system/kubelet.service.d/10-kubeadm.conf and run:

vim /etc/systemd/system/kubelet.service.d/10-kubeadm.conf # fix

systemctl daemon-reload && systemctl restart kubelet

systemctl status kubelet  # should now show running
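
The edit itself can also be done non-interactively. This is only a sketch under the assumption that the wrong path appears right after ExecStart= and is followed by a space, as in the status output above; the function name fix_kubelet_path is hypothetical, and the demo runs on a scratch file, not on the real drop-in.

```shell
# Replace the kubelet binary path in a systemd drop-in file.
# $1 = file, $2 = wrong path, $3 = correct path.
fix_kubelet_path() {
  tmp=$(mktemp)
  # Write the result back with cat so the original file keeps its permissions.
  sed "s|ExecStart=$2 |ExecStart=$3 |" "$1" > "$tmp" && cat "$tmp" > "$1"
  rm -f "$tmp"
}

# Demo on a scratch copy holding a shortened ExecStart line.
conf=$(mktemp)
printf '%s\n' 'ExecStart=/usr/local/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS' > "$conf"
fix_kubelet_path "$conf" /usr/local/bin/kubelet /usr/bin/kubelet
cat "$conf"   # prints: ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS
rm -f "$conf"
```

After changing the real file, systemctl daemon-reload && systemctl restart kubelet is still required.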

The node should also become available to the API server, give it a bit of time though:

➜ k get node
NAME                     STATUS   ROLES           AGE   VERSION
cluster3-controlplane1   Ready    control-plane   14d   v1.28.2
cluster3-node1           Ready    <none>          14d   v1.28.2

Finally we write the reason into the file:

# /opt/course/18/reason.txt
wrong path to kubelet binary specified in service config

47. Create Secret and mount into Pod

Use context: kubectl config use-context k8s-c3-CCC

Do the following in a new Namespace secret. Create a Pod named secret-pod of image busybox:1.31.1 which should keep running for some time.

There is an existing Secret located at /opt/course/19/secret1.yaml, create it in the Namespace secret and mount it readonly into the Pod at /tmp/secret1.

Create a new Secret in Namespace secret called secret2 which should contain user=user1 and pass=1234. These entries should be available inside the Pod's container as environment variables APP_USER and APP_PASS.

Confirm everything is working.

First we create the Namespace and the requested Secrets in it:

k create ns secret

cp /opt/course/19/secret1.yaml 19_secret1.yaml

vim 19_secret1.yaml

We need to adjust the Namespace for that Secret:

# 19_secret1.yaml
apiVersion: v1
data:
  halt: IyEgL2Jpbi9zaAo...
kind: Secret
metadata:
  creationTimestamp: null
  name: secret1
  namespace: secret           # change

k -f 19_secret1.yaml create

Next we create the second Secret:

k -n secret create secret generic secret2 --from-literal=user=user1 --from-literal=pass=1234
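
The literals end up base64-encoded under .data of the Secret, which is an encoding and not encryption. A tiny sketch to show what the stored values look like; the two helper names are hypothetical.

```shell
# Encode and decode Secret values the way they appear under .data.
encode_secret_value() { printf '%s' "$1" | base64; }
decode_secret_value() { printf '%s' "$1" | base64 -d; }

encode_secret_value user1      # prints: dXNlcjE=
encode_secret_value 1234       # prints: MTIzNA==
decode_secret_value MTIzNA==   # prints: 1234 (without a trailing newline)
```

To check the real object, something like k -n secret get secret secret2 -o jsonpath='{.data.user}' piped into base64 -d should print user1.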

Now we create the Pod template:

k -n secret run secret-pod --image=busybox:1.31.1 $do -- sh -c "sleep 1d" > 19.yaml

vim 19.yaml

Then make the necessary changes:

# 19.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: secret-pod
  name: secret-pod
  namespace: secret                       # add
spec:
  containers:
  - args:
    - sh
    - -c
    - sleep 1d
    image: busybox:1.31.1
    name: secret-pod
    resources: {}
    env:                                  # add
    - name: APP_USER                      # add
      valueFrom:                          # add
        secretKeyRef:                     # add
          name: secret2                   # add
          key: user                       # add
    - name: APP_PASS                      # add
      valueFrom:                          # add
        secretKeyRef:                     # add
          name: secret2                   # add
          key: pass                       # add
    volumeMounts:                         # add
    - name: secret1                       # add
      mountPath: /tmp/secret1             # add
      readOnly: true                      # add
  dnsPolicy: ClusterFirst
  restartPolicy: Always
  volumes:                                # add
  - name: secret1                         # add
    secret:                               # add
      secretName: secret1                 # add
status: {}

In current K8s versions it might not be necessary to specify readOnly: true, because it's the default setting anyway.

And execute:

k -f 19.yaml create

Finally we check if all is correct:

➜ k -n secret exec secret-pod -- env | grep APP
APP_PASS=1234
APP_USER=user1

➜ k -n secret exec secret-pod -- find /tmp/secret1
/tmp/secret1
/tmp/secret1/..data
/tmp/secret1/halt
/tmp/secret1/..2019_12_08_12_15_39.463036797
/tmp/secret1/..2019_12_08_12_15_39.463036797/halt

➜ k -n secret exec secret-pod -- cat /tmp/secret1/halt
#! /bin/sh
### BEGIN INIT INFO
# Provides:          halt
# Required-Start:
# Required-Stop:
# Default-Start:
# Default-Stop:      0
# Short-Description: Execute the halt command.
# Description:
...

48. Update Kubernetes Version and join cluster

Use context: kubectl config use-context k8s-c3-CCC

Your coworker said node cluster3-node2 is running an older Kubernetes version and is not even part of the cluster. Update Kubernetes on that node to the exact version that's running on cluster3-controlplane1. Then add this node to the cluster. Use kubeadm for this.

Upgrade Kubernetes to cluster3-controlplane1 version

Search in the docs for kubeadm upgrade: https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade

➜ k get node
NAME                     STATUS   ROLES           AGE   VERSION
cluster3-controlplane1   Ready    control-plane   22h   v1.28.2
cluster3-node1           Ready    <none>          22h   v1.28.2

Controlplane node seems to be running Kubernetes 1.28.2. Node cluster3-node2 might not yet be part of the cluster depending on previous tasks.

➜ ssh cluster3-node2

➜ root@cluster3-node2:~# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.4", GitCommit:"fa3d7990104d7c1f16943a67f11b154b71f6a132", GitTreeState:"clean", BuildDate:"2023-07-19T12:19:40Z", GoVersion:"go1.20.6", Compiler:"gc", Platform:"linux/amd64"}

➜ root@cluster3-node2:~# kubectl version --short
Flag --short has been deprecated, and will be removed in the future. The --short output will become the default.
Client Version: v1.27.4
Kustomize Version: v5.0.1
The connection to the server localhost:8080 was refused - did you specify the right host or port?

➜ root@cluster3-node2:~# kubelet --version
Kubernetes v1.27.4

Here kubeadm is already installed in the wanted version, so we don't need to install it. Hence we can run:

➜ root@cluster3-node2:~# kubeadm upgrade node
couldn't create a Kubernetes client from file "/etc/kubernetes/kubelet.conf": failed to load admin kubeconfig: open /etc/kubernetes/kubelet.conf: no such file or directory
To see the stack trace of this error execute with --v=5 or higher

This is usually the proper command to upgrade a node. But this error means that this node was never even initialised, so nothing to update here. This will be done later using kubeadm join. For now we can continue with kubelet and kubectl:

➜ root@cluster3-node2:~# apt update
Hit:1 http://ppa.launchpad.net/rmescandon/yq/ubuntu focal InRelease
Get:2 http://security.ubuntu.com/ubuntu focal-security InRelease [114 kB]                        
Hit:4 http://us.archive.ubuntu.com/ubuntu focal InRelease                                         
Get:3 https://packages.cloud.google.com/apt kubernetes-xenial InRelease [8,993 B]
Get:5 http://us.archive.ubuntu.com/ubuntu focal-updates InRelease [114 kB]
Get:6 http://us.archive.ubuntu.com/ubuntu focal-backports InRelease [108 kB]
Get:7 http://us.archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages [2,851 kB]
Get:8 http://us.archive.ubuntu.com/ubuntu focal-updates/main i386 Packages [884 kB]
Get:9 http://us.archive.ubuntu.com/ubuntu focal-updates/universe amd64 Packages [1,117 kB]
Get:10 http://us.archive.ubuntu.com/ubuntu focal-updates/universe i386 Packages [748 kB]
Fetched 5,946 kB in 3s (2,063 kB/s)                       
Reading package lists... Done
Building dependency tree       
Reading state information... Done
217 packages can be upgraded. Run 'apt list --upgradable' to see them.

➜ root@cluster3-node2:~# apt show kubectl -a | grep 1.28
...
Version: 1.28.2-00
Version: 1.28.1-00
Version: 1.28.0-00
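
The package version to install follows from the controlplane's node version. A minimal sketch, assuming the repository uses the -00 revision suffix seen in the listing above; node_version and apt_version are hypothetical helper names and the sample table is copied from the earlier k get node output.

```shell
# Print the VERSION column for node $1 from `kubectl get node` output on stdin.
node_version() {
  awk -v n="$1" '$1 == n { print $5 }'
}

# Turn "v1.28.2" into the apt version string "1.28.2-00".
# Assumption: the apt repository uses the "-00" revision suffix.
apt_version() {
  printf '%s-00\n' "${1#v}"
}

sample_nodes='NAME                     STATUS   ROLES           AGE   VERSION
cluster3-controlplane1   Ready    control-plane   22h   v1.28.2
cluster3-node1           Ready    <none>          22h   v1.28.2'

target=$(printf '%s\n' "$sample_nodes" | node_version cluster3-controlplane1)
apt_version "$target"   # prints: 1.28.2-00
```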

➜ root@cluster3-node2:~# apt install kubectl=1.28.2-00 kubelet=1.28.2-00
...
Fetched 29.1 MB in 4s (7,547 kB/s)  
(Reading database ... 112527 files and directories currently installed.)
Preparing to unpack .../kubectl_1.28.2-00_amd64.deb ...
Unpacking kubectl (1.28.2-00) over (1.27.4-00) ...
dpkg: warning: downgrading kubelet from 1.27.4-00 to 1.28.2-00
Preparing to unpack .../kubelet_1.28.2-00_amd64.deb ...
Unpacking kubelet (1.28.2-00) over (1.27.4-00) ...
Setting up kubectl (1.28.2-00) ...
Setting up kubelet (1.28.2-00) ...

➜ root@cluster3-node2:~# kubelet --version
Kubernetes v1.28.2

Now we're up to date with kubeadm, kubectl and kubelet. Restart the kubelet:

➜ root@cluster3-node2:~# service kubelet restart

➜ root@cluster3-node2:~# service kubelet status
● kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/kubelet.service.d
             └─10-kubeadm.conf
     Active: activating (auto-restart) (Result: exit-code) since Fri 2023-09-22 14:37:37 UTC; 2s a>
       Docs: https://kubernetes.io/docs/home/
    Process: 34331 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBEL>
   Main PID: 34331 (code=exited, status=1/FAILURE)

Sep 22 14:37:37 cluster3-node2 systemd[1]: kubelet.service: Main process exited, code=exited, stat>
Sep 22 14:37:37 cluster3-node2 systemd[1]: kubelet.service: Failed with result 'exit-code'.

These errors occur because we still need to run kubeadm join to join the node into the cluster. Let's do this in the next step.

Add cluster3-node2 to cluster

First we log into the controlplane1 and generate a new TLS bootstrap token, also printing out the join command:

➜ ssh cluster3-controlplane1

➜ root@cluster3-controlplane1:~# kubeadm token create --print-join-command
kubeadm join 192.168.100.31:6443 --token lyl4o0.vbkmv9rdph5qd660 --discovery-token-ca-cert-hash sha256:b0c94ccf935e27306ff24bce4b8f611c621509e80075105b3f25d296a94927ce 

➜ root@cluster3-controlplane1:~# kubeadm token list
TOKEN                     TTL         EXPIRES                ...
lyl4o0.vbkmv9rdph5qd660   23h         2023-09-23T14:38:12Z   ...
n4dkqj.hu52l46jfo4he61e   <forever>   <never>                ...
s7cmex.ty1olulkuljju9am   18h         2023-09-23T09:34:20Z   ...

We see the expiration of 23h for our token; we could adjust this by passing the --ttl argument.

Next we connect again to cluster3-node2 and simply execute the join command:

➜ ssh cluster3-node2

➜ root@cluster3-node2:~# kubeadm join 192.168.100.31:6443 --token lyl4o0.vbkmv9rdph5qd660 --discovery-token-ca-cert-hash sha256:b0c94ccf935e27306ff24bce4b8f611c621509e80075105b3f25d296a94927ce 

[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
W0922 14:39:56.786605   34648 configset.go:177] error unmarshaling configuration schema.GroupVersionKind{Group:"kubeproxy.config.k8s.io", Version:"v1alpha1", Kind:"KubeProxyConfiguration"}: strict decoding error: unknown field "logging"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

➜ root@cluster3-node2:~# service kubelet status
● kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/kubelet.service.d
             └─10-kubeadm.conf
     Active: active (running) since Fri 2023-09-22 14:39:57 UTC; 14s ago
       Docs: https://kubernetes.io/docs/home/
   Main PID: 34695 (kubelet)
      Tasks: 12 (limit: 462)
     Memory: 55.4M
...

If you have trouble with kubeadm join you might need to run kubeadm reset.

This looks great though for us. Finally we head back to the main terminal and check the node status:

➜ k get node
NAME                     STATUS     ROLES           AGE    VERSION
cluster3-controlplane1   Ready      control-plane   102m   v1.28.2
cluster3-node1           Ready      <none>          97m    v1.28.2
cluster3-node2           NotReady   <none>          108s   v1.28.2

Give it a bit of time till the node is ready.

➜ k get node
NAME                     STATUS     ROLES           AGE    VERSION
cluster3-controlplane1   Ready      control-plane   102m   v1.28.2
cluster3-node1           Ready      <none>          97m    v1.28.2
cluster3-node2           Ready      <none>          108s   v1.28.2

We see cluster3-node2 is now available and up to date.

49. Create a Static Pod and Service

Use context: kubectl config use-context k8s-c3-CCC

Create a Static Pod named my-static-pod in Namespace default on cluster3-controlplane1. It should be of image nginx:1.16-alpine and have resource requests for 10m CPU and 20Mi memory.

Then create a NodePort Service named static-pod-service which exposes that static Pod on port 80 and check if it has Endpoints and if it's reachable through the cluster3-controlplane1 internal IP address. You can connect to the internal node IPs from your main terminal.

➜ ssh cluster3-controlplane1
➜ root@cluster3-controlplane1:~# cd /etc/kubernetes/manifests/
➜ root@cluster3-controlplane1:~# kubectl run my-static-pod \
    --image=nginx:1.16-alpine \
    -o yaml --dry-run=client > my-static-pod.yaml

Then edit the my-static-pod.yaml to add the requested resource requests:

# /etc/kubernetes/manifests/my-static-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: my-static-pod
  name: my-static-pod
spec:
  containers:
  - image: nginx:1.16-alpine
    name: my-static-pod
    resources:
      requests:
        cpu: 10m
        memory: 20Mi
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}

And make sure it's running:

➜ k get pod -A | grep my-static
NAMESPACE     NAME                                   READY   STATUS   ...   AGE
default       my-static-pod-cluster3-controlplane1   1/1     Running  ...   22s

Now we expose that static Pod:

k expose pod my-static-pod-cluster3-controlplane1 \
  --name static-pod-service \
  --type=NodePort \
  --port 80

This would generate a Service like:

# kubectl expose pod my-static-pod-cluster3-controlplane1 --name static-pod-service --type=NodePort --port 80
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: null
  labels:
    run: my-static-pod
  name: static-pod-service
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    run: my-static-pod
  type: NodePort
status:
  loadBalancer: {}

Then run and test:

➜ k get svc,ep -l run=my-static-pod
NAME                         TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
service/static-pod-service   NodePort   10.99.168.252   <none>        80:30352/TCP   30s

NAME                           ENDPOINTS      AGE
endpoints/static-pod-service   10.32.0.4:80   30sec

Looking good
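
For the reachability check through the node's internal IP we need the allocated node port, which sits in the PORT(S) column. A small sketch; the helper name node_port is hypothetical and the sample value is the one from the listing above.

```shell
# Extract the node port from a PORT(S) value such as "80:30352/TCP".
node_port() {
  p=${1#*:}                  # drop the service port and the colon
  printf '%s\n' "${p%%/*}"   # drop the protocol suffix
}

node_port 80:30352/TCP   # prints: 30352

# The same value should also be available directly from the API (not run here):
#   kubectl get svc static-pod-service -o jsonpath='{.spec.ports[0].nodePort}'
```

With the internal IP taken from k get node -o wide, a curl against that IP and port from the main terminal should return the nginx welcome page.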

50. Check how long certificates are valid

Use context: kubectl config use-context k8s-c2-AC

Check how long the kube-apiserver server certificate is valid on cluster2-controlplane1. Do this with openssl or cfssl. Write the expiration date into /opt/course/22/expiration.

Also run the correct kubeadm command to list the expiration dates and confirm both methods show the same date.

Write the correct kubeadm command that would renew the apiserver server certificate into /opt/course/22/kubeadm-renew-certs.sh.

First let's find that certificate:

➜ ssh cluster2-controlplane1

➜ root@cluster2-controlplane1:~# find /etc/kubernetes/pki | grep apiserver
/etc/kubernetes/pki/apiserver.crt
/etc/kubernetes/pki/apiserver-etcd-client.crt
/etc/kubernetes/pki/apiserver-etcd-client.key
/etc/kubernetes/pki/apiserver-kubelet-client.crt
/etc/kubernetes/pki/apiserver.key
/etc/kubernetes/pki/apiserver-kubelet-client.key

Next we use openssl to find out the expiration date:

➜ root@cluster2-controlplane1:~# openssl x509  -noout -text -in /etc/kubernetes/pki/apiserver.crt | grep Validity -A2
        Validity
            Not Before: Dec 20 18:05:20 2022 GMT
            Not After : Dec 20 18:05:20 2023 GMT

There we have it, so we write it in the required location on our main terminal:

# /opt/course/22/expiration
Dec 20 18:05:20 2023 GMT
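
A shorter route to the same date is the -enddate option of openssl x509, which prints a single notAfter= line. A minimal sketch of stripping that prefix; the function name cert_not_after is hypothetical and the sample line reuses the date found above.

```shell
# Strip the "notAfter=" prefix from `openssl x509 -noout -enddate` output on stdin.
cert_not_after() {
  sed -n 's/^notAfter=//p'
}

printf '%s\n' 'notAfter=Dec 20 18:05:20 2023 GMT' | cert_not_after
# prints: Dec 20 18:05:20 2023 GMT
```

On the controlplane node the input would come from openssl x509 -noout -enddate -in /etc/kubernetes/pki/apiserver.crt.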

And we use the feature from kubeadm to get the expiration too:

➜ root@cluster2-controlplane1:~# kubeadm certs check-expiration | grep apiserver
apiserver                Jan 14, 2022 18:49 UTC   363d        ca               no      
apiserver-etcd-client    Jan 14, 2022 18:49 UTC   363d        etcd-ca          no      
apiserver-kubelet-client Jan 14, 2022 18:49 UTC   363d        ca               no

Looking good. And finally we write the command that would renew the apiserver certificate into the requested location:

# /opt/course/22/kubeadm-renew-certs.sh
kubeadm certs renew apiserver

51. Kubelet client/server cert info

Use context: kubectl config use-context k8s-c2-AC

Node cluster2-node1 has been added to the cluster using kubeadm and TLS bootstrapping.

Find the "Issuer" and "Extended Key Usage" values of the cluster2-node1:

kubelet client certificate, the one used for outgoing connections to the kube-apiserver.
kubelet server certificate, the one used for incoming connections from the kube-apiserver.

Write the information into file /opt/course/23/certificate-info.txt.

Compare the "Issuer" and "Extended Key Usage" fields of both certificates and make sense of these.

To find the correct kubelet certificate directory, we can look up the default value of the kubelet's --cert-dir parameter. Searching for "kubelet" in the Kubernetes docs leads to: https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet. We can check whether another certificate directory has been configured using ps aux or in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf.

First we check the kubelet client certificate:

➜ ssh cluster2-node1

➜ root@cluster2-node1:~# openssl x509  -noout -text -in /var/lib/kubelet/pki/kubelet-client-current.pem | grep Issuer
        Issuer: CN = kubernetes
        
➜ root@cluster2-node1:~# openssl x509  -noout -text -in /var/lib/kubelet/pki/kubelet-client-current.pem | grep "Extended Key Usage" -A1
            X509v3 Extended Key Usage: 
                TLS Web Client Authentication

Next we check the kubelet server certificate:

➜ root@cluster2-node1:~# openssl x509  -noout -text -in /var/lib/kubelet/pki/kubelet.crt | grep Issuer
          Issuer: CN = cluster2-node1-ca@1588186506

➜ root@cluster2-node1:~# openssl x509  -noout -text -in /var/lib/kubelet/pki/kubelet.crt | grep "Extended Key Usage" -A1
            X509v3 Extended Key Usage: 
                TLS Web Server Authentication

We see that the server certificate was generated on the worker node itself and the client certificate was issued by the Kubernetes api. The "Extended Key Usage" also shows if it's for client or server authentication.

More about this: https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-tls-bootstrapping

52. NetworkPolicy

Use context: kubectl config use-context k8s-c1-H

There was a security incident where an intruder was able to access the whole cluster from a single hacked backend Pod.

To prevent this create a NetworkPolicy called np-backend in Namespace project-snake. It should allow the backend-* Pods only to:

connect to db1-* Pods on port 1111
connect to db2-* Pods on port 2222

Use the app label of Pods in your policy.

After implementation, connections from backend-* Pods to vault-* Pods on port 3333 should for example no longer work.

First we look at the existing Pods and their labels:

➜ k -n project-snake get pod
NAME        READY   STATUS    RESTARTS   AGE
backend-0   1/1     Running   0          8s
db1-0       1/1     Running   0          8s
db2-0       1/1     Running   0          10s
vault-0     1/1     Running   0          10s

➜ k -n project-snake get pod -L app
NAME        READY   STATUS    RESTARTS   AGE     APP
backend-0   1/1     Running   0          3m15s   backend
db1-0       1/1     Running   0          3m15s   db1
db2-0       1/1     Running   0          3m17s   db2
vault-0     1/1     Running   0          3m17s   vault

We test the current connection situation and see nothing is restricted:

➜ k -n project-snake get pod -o wide
NAME        READY   STATUS    RESTARTS   AGE     IP          ...
backend-0   1/1     Running   0          4m14s   10.44.0.24  ...
db1-0       1/1     Running   0          4m14s   10.44.0.25  ...
db2-0       1/1     Running   0          4m16s   10.44.0.23  ...
vault-0     1/1     Running   0          4m16s   10.44.0.22  ...

➜ k -n project-snake exec backend-0 -- curl -s 10.44.0.25:1111
database one

➜ k -n project-snake exec backend-0 -- curl -s 10.44.0.23:2222
database two

➜ k -n project-snake exec backend-0 -- curl -s 10.44.0.22:3333
vault secret storage

Now we create the NP by copying and changing an example from the k8s docs:

vim 24_np.yaml

# 24_np.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: np-backend
  namespace: project-snake
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Egress                    # policy is only about Egress
  egress:
    -                           # first rule
      to:                           # first condition "to"
      - podSelector:
          matchLabels:
            app: db1
      ports:                        # second condition "port"
      - protocol: TCP
        port: 1111
    -                           # second rule
      to:                           # first condition "to"
      - podSelector:
          matchLabels:
            app: db2
      ports:                        # second condition "port"
      - protocol: TCP
        port: 2222

The NP above has two rules with two conditions each, it can be read as:

allow outgoing traffic if:
  (destination pod has label app=db1 AND port is 1111)
  OR
  (destination pod has label app=db2 AND port is 2222)

Wrong example

Now let's shortly look at a wrong example:

# WRONG
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: np-backend
  namespace: project-snake
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Egress
  egress:
    -                           # first rule
      to:                           # first condition "to"
      - podSelector:                    # first "to" possibility
          matchLabels:
            app: db1
      - podSelector:                    # second "to" possibility
          matchLabels:
            app: db2
      ports:                        # second condition "ports"
      - protocol: TCP                   # first "ports" possibility
        port: 1111
      - protocol: TCP                   # second "ports" possibility
        port: 2222

The NP above has one rule with two conditions and two condition-entries each, it can be read as:

allow outgoing traffic if:
  (destination pod has label app=db1 OR destination pod has label app=db2)
  AND
  (destination port is 1111 OR destination port is 2222)

Using this NP it would still be possible for backend-* Pods to connect to db2-* Pods on port 1111 for example which should be forbidden.
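
The difference between the two readings can be sketched as two small shell predicates. This only models the AND/OR logic of the rules; it is not how Kubernetes evaluates a NetworkPolicy, and the function names allowed_correct and allowed_wrong are hypothetical.

```shell
# Correct policy: two separate rules, each pairing one destination with one port.
allowed_correct() {
  app=$1 port=$2
  { [ "$app" = db1 ] && [ "$port" = 1111 ]; } ||
  { [ "$app" = db2 ] && [ "$port" = 2222 ]; }
}

# Wrong policy: one rule, any listed destination combined with any listed port.
allowed_wrong() {
  app=$1 port=$2
  { [ "$app" = db1 ] || [ "$app" = db2 ]; } &&
  { [ "$port" = 1111 ] || [ "$port" = 2222 ]; }
}

# Compare both readings for a few destination/port pairs.
for pair in "db1 1111" "db2 2222" "db2 1111" "vault 3333"; do
  set -- $pair
  if allowed_correct "$1" "$2"; then c=allow; else c=deny; fi
  if allowed_wrong "$1" "$2"; then w=allow; else w=deny; fi
  echo "$1:$2 correct=$c wrong=$w"
done
```

The only pair on which the two readings disagree is db2 on port 1111: the correct policy denies it, the wrong one allows it.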

Create NetworkPolicy

We create the correct NP:

k -f 24_np.yaml create

And test again:

➜ k -n project-snake exec backend-0 -- curl -s 10.44.0.25:1111
database one

➜ k -n project-snake exec backend-0 -- curl -s 10.44.0.23:2222
database two

➜ k -n project-snake exec backend-0 -- curl -s 10.44.0.22:3333
^C

It is also helpful to use kubectl describe on the NP to see how k8s has interpreted the policy.

Great, looking more secure. Task done.

53. Etcd Snapshot Save and Restore

Use context: kubectl config use-context k8s-c3-CCC

Make a backup of etcd running on cluster3-controlplane1 and save it on the controlplane node at /tmp/etcd-backup.db.

Then create any kind of Pod in the cluster.

Finally restore the backup, confirm the cluster is still working and that the created Pod is no longer with us.

Etcd Backup

First we log into the controlplane and try to create a snapshot of etcd:

➜ ssh cluster3-controlplane1
➜ root@cluster3-controlplane1:~# ETCDCTL_API=3 etcdctl snapshot save /tmp/etcd-backup.db
Error:  rpc error: code = Unavailable desc = transport is closing

But it fails because we need to authenticate ourselves. For the necessary information we can check the etcd manifest:

➜ root@cluster3-controlplane1:~# vim /etc/kubernetes/manifests/etcd.yaml

We only check the etcd.yaml for the necessary information; we don't change it.

# /etc/kubernetes/manifests/etcd.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
spec:
  containers:
  - command:
    - etcd
    - --advertise-client-urls=https://192.168.100.31:2379
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt                           # use
    - --client-cert-auth=true
    - --data-dir=/var/lib/etcd
    - --initial-advertise-peer-urls=https://192.168.100.31:2380
    - --initial-cluster=cluster3-controlplane1=https://192.168.100.31:2380
    - --key-file=/etc/kubernetes/pki/etcd/server.key                            # use
    - --listen-client-urls=https://127.0.0.1:2379,https://192.168.100.31:2379   # use
    - --listen-metrics-urls=http://127.0.0.1:2381
    - --listen-peer-urls=https://192.168.100.31:2380
    - --name=cluster3-controlplane1
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --peer-client-cert-auth=true
    - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt                    # use
    - --snapshot-count=10000
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    image: k8s.gcr.io/etcd:3.3.15-0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /health
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 15
      timeoutSeconds: 15
    name: etcd
    resources: {}
    volumeMounts:
    - mountPath: /var/lib/etcd
      name: etcd-data
    - mountPath: /etc/kubernetes/pki/etcd
      name: etcd-certs
  hostNetwork: true
  priorityClassName: system-cluster-critical
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /var/lib/etcd                                                     # important
      type: DirectoryOrCreate
    name: etcd-data
status: {}

But we also know that the api-server is connecting to etcd, so we can check how its manifest is configured:

➜ root@cluster3-controlplane1:~# cat /etc/kubernetes/manifests/kube-apiserver.yaml | grep etcd
    - --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
    - --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
    - --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
    - --etcd-servers=https://127.0.0.1:2379

We use the authentication information and pass it to etcdctl:

➜ root@cluster3-controlplane1:~# ETCDCTL_API=3 etcdctl snapshot save /tmp/etcd-backup.db \
--cacert /etc/kubernetes/pki/etcd/ca.crt \
--cert /etc/kubernetes/pki/etcd/server.crt \
--key /etc/kubernetes/pki/etcd/server.key

Snapshot saved at /tmp/etcd-backup.db
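
As a rough sketch, the three file paths can also be read out of the manifest instead of being copied by hand. The helper flag_value is hypothetical, and the sample lines are copied from the etcd.yaml shown above; on the node the input would be /etc/kubernetes/manifests/etcd.yaml. The etcdctl command is only printed here, not executed.

```shell
# Hypothetical helper: print the value of one etcd flag from a manifest on stdin.
# $1 is the flag name without leading dashes, e.g. cert-file.
flag_value() {
  sed -n "s/^ *- --$1=\([^ ]*\).*/\1/p"
}

# Sample lines copied from the manifest above (trailing comments are ignored).
manifest='    - --cert-file=/etc/kubernetes/pki/etcd/server.crt                           # use
    - --key-file=/etc/kubernetes/pki/etcd/server.key                            # use
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt'

cacert=$(printf '%s\n' "$manifest" | flag_value trusted-ca-file)
cert=$(printf '%s\n' "$manifest" | flag_value cert-file)
key=$(printf '%s\n' "$manifest" | flag_value key-file)

# Print the command that would be run on the controlplane node.
echo "ETCDCTL_API=3 etcdctl snapshot save /tmp/etcd-backup.db --cacert $cacert --cert $cert --key $key"
```

Anchoring the pattern on "- --" followed directly by the flag name keeps the peer-* flags from matching when the plain flag is requested.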

Etcd restore

Now create a Pod in the cluster and wait for it to be running:

➜ root@cluster3-controlplane1:~# kubectl run test --image=nginx
pod/test created

➜ root@cluster3-controlplane1:~# kubectl get pod -l run=test -w
NAME   READY   STATUS    RESTARTS   AGE
test   1/1     Running   0          60s

Next we stop all controlplane components:

root@cluster3-controlplane1:~# cd /etc/kubernetes/manifests/

root@cluster3-controlplane1:/etc/kubernetes/manifests# mv * ..

root@cluster3-controlplane1:/etc/kubernetes/manifests# watch crictl ps

Now we restore the snapshot into a specific directory:

➜ root@cluster3-controlplane1:~# ETCDCTL_API=3 etcdctl snapshot restore /tmp/etcd-backup.db \
--data-dir /var/lib/etcd-backup \
--cacert /etc/kubernetes/pki/etcd/ca.crt \
--cert /etc/kubernetes/pki/etcd/server.crt \
--key /etc/kubernetes/pki/etcd/server.key

2020-09-04 16:50:19.650804 I | mvcc: restore compact to 9935
2020-09-04 16:50:19.659095 I | etcdserver/membership: added member 8e9e05c52164694d [http://localhost:2380] to cluster cdf818194e3a8c32

We could specify another host to make the backup from by using etcdctl --endpoints http://IP, but here we just use the default value which is: http://127.0.0.1:2379,http://127.0.0.1:4001.

The restored files are located at the new folder /var/lib/etcd-backup, now we have to tell etcd to use that directory:

➜ root@cluster3-controlplane1:~# vim /etc/kubernetes/etcd.yaml

# /etc/kubernetes/etcd.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
spec:
...
    - mountPath: /etc/kubernetes/pki/etcd
      name: etcd-certs
  hostNetwork: true
  priorityClassName: system-cluster-critical
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /var/lib/etcd-backup                # change
      type: DirectoryOrCreate
    name: etcd-data
status: {}
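
Instead of editing by hand, the same one-line change could be made with sed. The following is a sketch on a scratch copy of the volumes section, so it is safe to run anywhere; on the node the target would be /etc/kubernetes/etcd.yaml, and the scratch directory here is purely illustrative.

```shell
# Scratch copy of the volumes section (sample content taken from the manifest above).
workdir=$(mktemp -d)
cat > "$workdir/etcd.yaml" <<'EOF'
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /var/lib/etcd
      type: DirectoryOrCreate
    name: etcd-data
EOF

# Rewrite only the hostPath that is exactly /var/lib/etcd; the certs path is left alone.
sed 's#^\( *path: \)/var/lib/etcd$#\1/var/lib/etcd-backup#' \
  "$workdir/etcd.yaml" > "$workdir/etcd.yaml.new"
mv "$workdir/etcd.yaml.new" "$workdir/etcd.yaml"

# Show the resulting paths for a quick visual check.
grep 'path:' "$workdir/etcd.yaml"
```

Only the hostPath of the etcd-data volume changes; the --data-dir flag and the container mountPath stay at /var/lib/etcd, matching the manual edit above.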

Now we move all controlplane yaml again into the manifest directory. Give it some time (up to several minutes) for etcd to restart and for the api-server to be reachable again:

root@cluster3-controlplane1:/etc/kubernetes/manifests# mv ../*.yaml .

root@cluster3-controlplane1:/etc/kubernetes/manifests# watch crictl ps

Then we check again for the Pod:

➜ root@cluster3-controlplane1:~# kubectl get pod -l run=test
No resources found in default namespace.

Awesome, backup and restore worked as our pod is gone.

54. Find Pods first to be terminated

Use context: kubectl config use-context k8s-c1-H

Check all available Pods in the Namespace project-c13 and find the names of those that would probably be terminated first if the nodes run out of resources (cpu or memory) to schedule all Pods. Write the Pod names into /opt/course/e1/pods-not-stable.txt.

When available cpu or memory resources on the nodes reach their limit, Kubernetes will look for Pods that are using more resources than they requested. These will be the first candidates for termination. If some Pods' containers have no resource requests/limits set, then by default those are considered to use more than requested.

Kubernetes assigns Quality of Service classes to Pods based on the defined resources and limits, read more here: https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod

Hence we should look for Pods without resource requests defined, we can do this with a manual approach:

k -n project-c13 describe pod | less -p Requests # describe all pods and highlight Requests

Or we do:

k -n project-c13 describe pod | egrep "^(Name:|    Requests:)" -A1

We see that the Pods of Deployment c13-3cc-runner-heavy don't have any resource requests specified. Hence our answer would be:

# /opt/course/e1/pods-not-stable.txt
c13-3cc-runner-heavy-65588d7d6-djtv9
c13-3cc-runner-heavy-65588d7d6-v8kf5
c13-3cc-runner-heavy-65588d7d6-wwpb4
o3db-0
o3db-1 # maybe not existing if already removed via previous scenario

To automate this process you could use jsonpath like this:

➜ k -n project-c13 get pod \
  -o jsonpath="{range .items[*]} {.metadata.name}{.spec.containers[*].resources}{'\n'}"

 c13-2x3-api-86784557bd-cgs8gmap[requests:map[cpu:50m memory:20Mi]]
 c13-2x3-api-86784557bd-lnxvjmap[requests:map[cpu:50m memory:20Mi]]
 c13-2x3-api-86784557bd-mnp77map[requests:map[cpu:50m memory:20Mi]]
 c13-2x3-web-769c989898-6hbgtmap[requests:map[cpu:50m memory:10Mi]]
 c13-2x3-web-769c989898-g57nqmap[requests:map[cpu:50m memory:10Mi]]
 c13-2x3-web-769c989898-hfd5vmap[requests:map[cpu:50m memory:10Mi]]
 c13-2x3-web-769c989898-jfx64map[requests:map[cpu:50m memory:10Mi]]
 c13-2x3-web-769c989898-r89mgmap[requests:map[cpu:50m memory:10Mi]]
 c13-2x3-web-769c989898-wtgxlmap[requests:map[cpu:50m memory:10Mi]]
 c13-3cc-runner-98c8b5469-dzqhrmap[requests:map[cpu:30m memory:10Mi]]
 c13-3cc-runner-98c8b5469-hbtdvmap[requests:map[cpu:30m memory:10Mi]]
 c13-3cc-runner-98c8b5469-n9lswmap[requests:map[cpu:30m memory:10Mi]]
 c13-3cc-runner-heavy-65588d7d6-djtv9map[]
 c13-3cc-runner-heavy-65588d7d6-v8kf5map[]
 c13-3cc-runner-heavy-65588d7d6-wwpb4map[]
 c13-3cc-web-675456bcd-glpq6map[requests:map[cpu:50m memory:10Mi]]
 c13-3cc-web-675456bcd-knlpxmap[requests:map[cpu:50m memory:10Mi]]
 c13-3cc-web-675456bcd-nfhp9map[requests:map[cpu:50m memory:10Mi]]
 c13-3cc-web-675456bcd-twn7mmap[requests:map[cpu:50m memory:10Mi]]
 o3db-0{}
 o3db-1{}

This lists all Pod names and their requests/limits, hence we see the Pods without those defined: the three c13-3cc-runner-heavy Pods and the two o3db Pods.

Or we look for the Quality of Service classes:

➜ k get pods -n project-c13 \
  -o jsonpath="{range .items[*]}{.metadata.name} {.status.qosClass}{'\n'}"

c13-2x3-api-86784557bd-cgs8g Burstable
c13-2x3-api-86784557bd-lnxvj Burstable
c13-2x3-api-86784557bd-mnp77 Burstable
c13-2x3-web-769c989898-6hbgt Burstable
c13-2x3-web-769c989898-g57nq Burstable
c13-2x3-web-769c989898-hfd5v Burstable
c13-2x3-web-769c989898-jfx64 Burstable
c13-2x3-web-769c989898-r89mg Burstable
c13-2x3-web-769c989898-wtgxl Burstable
c13-3cc-runner-98c8b5469-dzqhr Burstable
c13-3cc-runner-98c8b5469-hbtdv Burstable
c13-3cc-runner-98c8b5469-n9lsw Burstable
c13-3cc-runner-heavy-65588d7d6-djtv9 BestEffort
c13-3cc-runner-heavy-65588d7d6-v8kf5 BestEffort
c13-3cc-runner-heavy-65588d7d6-wwpb4 BestEffort
c13-3cc-web-675456bcd-glpq6 Burstable
c13-3cc-web-675456bcd-knlpx Burstable
c13-3cc-web-675456bcd-nfhp9 Burstable
c13-3cc-web-675456bcd-twn7m Burstable
o3db-0 BestEffort
o3db-1 BestEffort

Here we see five Pods with BestEffort, the class assigned to Pods that don't have any memory or cpu limits or requests defined.
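
To turn such a listing into the answer file, the BestEffort names can be filtered with awk. This is a sketch over a shortened sample of the listing above; on the cluster, the output of the kubectl jsonpath command would be piped in instead.

```shell
# Shortened sample copied from the QoS listing above.
qos_list='c13-2x3-api-86784557bd-cgs8g Burstable
c13-3cc-runner-heavy-65588d7d6-djtv9 BestEffort
c13-3cc-runner-heavy-65588d7d6-v8kf5 BestEffort
c13-3cc-runner-heavy-65588d7d6-wwpb4 BestEffort
c13-3cc-web-675456bcd-glpq6 Burstable
o3db-0 BestEffort
o3db-1 BestEffort'

# Keep only the names whose second column is BestEffort.
best_effort=$(printf '%s\n' "$qos_list" | awk '$2 == "BestEffort" { print $1 }')
echo "$best_effort"
```

Redirecting that output to /opt/course/e1/pods-not-stable.txt would produce the requested file in one step.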

A good practice is to always set resource requests and limits. If you don't know the values your containers should have you can find this out using metric tools like Prometheus. You can also use kubectl top pod or even kubectl exec into the container and use top and similar tools.

55. Curl Manually Contact API

Use context: kubectl config use-context k8s-c1-H

There is an existing ServiceAccount secret-reader in Namespace project-hamster. Create a Pod of image curlimages/curl:7.65.3 named tmp-api-contact which uses this ServiceAccount. Make sure the container keeps running.

Exec into the Pod and use curl to access the Kubernetes Api of that cluster manually, listing all available secrets. You can ignore insecure https connection. Write the command(s) for this into file /opt/course/e4/list-secrets.sh.

https://kubernetes.io/docs/tasks/run-application/access-api-from-pod

It's important to understand how the Kubernetes API works. For this it helps to connect to the api manually, for example using curl. You can find information fast by searching the Kubernetes docs for "curl api", for example.

First we create our Pod:

k run tmp-api-contact \
--image=curlimages/curl:7.65.3 $do \
--command > e2.yaml -- sh -c 'sleep 1d'

vim e2.yaml

Add the service account name and Namespace:

# e2.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: tmp-api-contact
  name: tmp-api-contact
  namespace: project-hamster          # add
spec:
  serviceAccountName: secret-reader   # add
  containers:
  - command:
    - sh
    - -c
    - sleep 1d
    image: curlimages/curl:7.65.3
    name: tmp-api-contact
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}

Then run and exec into:

k -f e2.yaml create
k -n project-hamster exec tmp-api-contact -it -- sh

Once in the container we can try to connect to the api using curl. The api is usually available via the Service named kubernetes in Namespace default (you should know how dns resolution works across Namespaces). Otherwise we can find the endpoint IP via environment variables by running env.
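
A small sketch of the environment-variable route: inside a Pod, KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT point at the api Service. The default values below are placeholders (assumptions) so the snippet also runs outside a Pod; they are only used when the variables are unset.

```shell
# Placeholder defaults, used only if the variables are not already set.
# Inside a Pod the injected values are kept, because := fills unset variables only.
: "${KUBERNETES_SERVICE_HOST:=10.96.0.1}"
: "${KUBERNETES_SERVICE_PORT:=443}"

# Build the URL for listing Secrets across all Namespaces.
api_base="https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}"
secrets_url="${api_base}/api/v1/secrets"
echo "$secrets_url"
```

The resulting URL can be used with curl in place of https://kubernetes.default/api/v1/secrets when dns is not available in the container.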

So now we can do:

curl https://kubernetes.default
curl -k https://kubernetes.default # ignore insecure as allowed in ticket description
curl -k https://kubernetes.default/api/v1/secrets # should show Forbidden 403

The last command shows 403 forbidden because we are not passing any authorisation information. The Kubernetes Api Server thinks we are connecting as system:anonymous. We want to change this and connect using the Pod's ServiceAccount named secret-reader.

We find the token in the mounted folder at /var/run/secrets/kubernetes.io/serviceaccount, so we do:

➜ TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
➜ curl -k https://kubernetes.default/api/v1/secrets -H "Authorization: Bearer ${TOKEN}"
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0{
  "kind": "SecretList",
  "apiVersion": "v1",
  "metadata": {
    "selfLink": "/api/v1/secrets",
    "resourceVersion": "10697"
  },
  "items": [
    {
      "metadata": {
        "name": "default-token-5zjbd",
        "namespace": "default",
        "selfLink": "/api/v1/namespaces/default/secrets/default-token-5zjbd",
        "uid": "315dbfd9-d235-482b-8bfc-c6167e7c1461",
        "resourceVersion": "342",
...

Now we're able to list all Secrets, authenticating as the ServiceAccount secret-reader under which our Pod is running.

To verify the server certificate instead of skipping the check with -k, we can run:

CACERT=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt
curl --cacert ${CACERT} https://kubernetes.default/api/v1/secrets -H "Authorization: Bearer ${TOKEN}"

For troubleshooting we could also check if the ServiceAccount is actually able to list Secrets using:

➜ k auth can-i get secret --as system:serviceaccount:project-hamster:secret-reader
yes

Finally write the commands into the requested location:

# /opt/course/e4/list-secrets.sh
TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
curl -k https://kubernetes.default/api/v1/secrets -H "Authorization: Bearer ${TOKEN}"

56. ETCD information

Use context: kubectl config use-context k8s-c2-AC

The cluster admin asked you to find out the following information about etcd running on cluster2-controlplane1:

Server private key location
Server certificate expiration date
Is client certificate authentication enabled

Write this information into /opt/course/p1/etcd-info.txt

Finally you're asked to save an etcd snapshot at /etc/etcd-snapshot.db on cluster2-controlplane1 and display its status.

Let's check the nodes:

➜ k get node
NAME                     STATUS   ROLES           AGE    VERSION
cluster2-controlplane1   Ready    control-plane   89m   v1.28.2
cluster2-node1           Ready    <none>          87m   v1.28.2

➜ ssh cluster2-controlplane1

First we check how etcd is setup in this cluster:

➜ root@cluster2-controlplane1:~# kubectl -n kube-system get pod
NAME                                                READY   STATUS    RESTARTS   AGE
coredns-66bff467f8-k8f48                            1/1     Running   0          26h
coredns-66bff467f8-rn8tr                            1/1     Running   0          26h
etcd-cluster2-controlplane1                         1/1     Running   0          26h
kube-apiserver-cluster2-controlplane1               1/1     Running   0          26h
kube-controller-manager-cluster2-controlplane1      1/1     Running   0          26h
kube-proxy-qthfg                                    1/1     Running   0          25h
kube-proxy-z55lp                                    1/1     Running   0          26h
kube-scheduler-cluster2-controlplane1               1/1     Running   1          26h
weave-net-cqdvt                                     2/2     Running   0          26h
weave-net-dxzgh                                     2/2     Running   1          25h

We see it's running as a Pod, more specifically a static Pod. So we check the default kubelet directory for static manifests:

➜ root@cluster2-controlplane1:~# find /etc/kubernetes/manifests/
/etc/kubernetes/manifests/
/etc/kubernetes/manifests/kube-controller-manager.yaml
/etc/kubernetes/manifests/kube-apiserver.yaml
/etc/kubernetes/manifests/etcd.yaml
/etc/kubernetes/manifests/kube-scheduler.yaml

➜ root@cluster2-controlplane1:~# vim /etc/kubernetes/manifests/etcd.yaml

So we look at the yaml and the parameters with which etcd is started:

# /etc/kubernetes/manifests/etcd.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
spec:
  containers:
  - command:
    - etcd
    - --advertise-client-urls=https://192.168.102.11:2379
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt              # server certificate
    - --client-cert-auth=true                                      # enabled
    - --data-dir=/var/lib/etcd
    - --initial-advertise-peer-urls=https://192.168.102.11:2380
    - --initial-cluster=cluster2-controlplane1=https://192.168.102.11:2380
    - --key-file=/etc/kubernetes/pki/etcd/server.key               # server private key
    - --listen-client-urls=https://127.0.0.1:2379,https://192.168.102.11:2379
    - --listen-metrics-urls=http://127.0.0.1:2381
    - --listen-peer-urls=https://192.168.102.11:2380
    - --name=cluster2-controlplane1
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --peer-client-cert-auth=true
    - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    - --snapshot-count=10000
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
...

We see that client authentication is enabled and also the requested path to the server private key, now let's find out the expiration of the server certificate:

➜ root@cluster2-controlplane1:~# openssl x509  -noout -text -in /etc/kubernetes/pki/etcd/server.crt | grep Validity -A2
        Validity
            Not Before: Sep 13 13:01:31 2021 GMT
            Not After : Sep 13 13:01:31 2022 GMT

There we have it. Let's write the information into the requested file:

# /opt/course/p1/etcd-info.txt
Server private key location: /etc/kubernetes/pki/etcd/server.key
Server certificate expiration date: Sep 13 13:01:31 2022 GMT
Is client certificate authentication enabled: yes

Create etcd snapshot

First we try:

ETCDCTL_API=3 etcdctl snapshot save /etc/etcd-snapshot.db

We also get the endpoint from the yaml. But we need to specify more parameters, all of which we can find in the yaml declaration above:

ETCDCTL_API=3 etcdctl snapshot save /etc/etcd-snapshot.db \
--cacert /etc/kubernetes/pki/etcd/ca.crt \
--cert /etc/kubernetes/pki/etcd/server.crt \
--key /etc/kubernetes/pki/etcd/server.key

This worked. Now we can output the status of the backup file:

➜ root@cluster2-controlplane1:~# ETCDCTL_API=3 etcdctl snapshot status /etc/etcd-snapshot.db
4d4e953, 7213, 1291, 2.7 MB

The status shows:

Hash: 4d4e953
Revision: 7213
Total Keys: 1291
Total Size: 2.7 MB

57. Service

Use context: kubectl config use-context k8s-c1-H

You're asked to confirm that kube-proxy is running correctly on all nodes. For this perform the following in Namespace project-hamster:

Create a new Pod named p2-pod with two containers, one of image nginx:1.21.3-alpine and one of image busybox:1.31. Make sure the busybox container keeps running for some time.

Create a new Service named p2-service which exposes that Pod internally in the cluster on port 3000->80.

Find the kube-proxy container on all nodes cluster1-controlplane1, cluster1-node1 and cluster1-node2 and make sure that it's using iptables. Use command crictl for this.

Write the iptables rules of all nodes belonging to the created Service p2-service into file /opt/course/p2/iptables.txt.

Finally delete the Service and confirm that the iptables rules are gone from all nodes.

Create the Pod

First we create the Pod:

# check out export statement on top which allows us to use $do
k run p2-pod --image=nginx:1.21.3-alpine $do > p2.yaml

vim p2.yaml

Next we add the requested second container:

# p2.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: p2-pod
  name: p2-pod
  namespace: project-hamster             # add
spec:
  containers:
  - image: nginx:1.21.3-alpine
    name: p2-pod
  - image: busybox:1.31                  # add
    name: c2                             # add
    command: ["sh", "-c", "sleep 1d"]    # add
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}

And we create the Pod:

k -f p2.yaml create

Create the Service

Next we create the Service:

k -n project-hamster expose pod p2-pod --name p2-service --port 3000 --target-port 80

This will create a yaml like:

apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2020-04-30T20:58:14Z"
  labels:
    run: p2-pod
  managedFields:
...
    operation: Update
    time: "2020-04-30T20:58:14Z"
  name: p2-service
  namespace: project-hamster
  resourceVersion: "11071"
  selfLink: /api/v1/namespaces/project-hamster/services/p2-service
  uid: 2a1c0842-7fb6-4e94-8cdb-1602a3b1e7d2
spec:
  clusterIP: 10.97.45.18
  ports:
  - port: 3000
    protocol: TCP
    targetPort: 80
  selector:
    run: p2-pod
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}

We should confirm Pods and Services are connected, hence the Service should have Endpoints.

k -n project-hamster get pod,svc,ep

Confirm kube-proxy is running and is using iptables

First we get nodes in the cluster:

➜ k get node
NAME                     STATUS   ROLES           AGE   VERSION
cluster1-controlplane1   Ready    control-plane   98m   v1.28.2
cluster1-node1           Ready    <none>          96m   v1.28.2
cluster1-node2           Ready    <none>          95m   v1.28.2

The idea here is to log into every node, find the kube-proxy container and check its logs:

➜ ssh cluster1-controlplane1

➜ root@cluster1-controlplane1:~# crictl ps | grep kube-proxy
27b6a18c0f89c       36c4ebbc9d979       3 hours ago         Running             kube-proxy

➜ root@cluster1-controlplane1:~# crictl logs 27b6a18c0f89c
...
I0913 12:53:03.096620       1 server_others.go:212] Using iptables Proxier.
...

This should be repeated on every node and result in the same output Using iptables Proxier.

Check kube-proxy is creating iptables rules

Now we check the iptables rules on every node first manually:

➜ ssh cluster1-controlplane1 iptables-save | grep p2-service
-A KUBE-SEP-6U447UXLLQIKP7BB -s 10.44.0.20/32 -m comment --comment "project-hamster/p2-service:" -j KUBE-MARK-MASQ
-A KUBE-SEP-6U447UXLLQIKP7BB -p tcp -m comment --comment "project-hamster/p2-service:" -m tcp -j DNAT --to-destination 10.44.0.20:80
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.97.45.18/32 -p tcp -m comment --comment "project-hamster/p2-service: cluster IP" -m tcp --dport 3000 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.97.45.18/32 -p tcp -m comment --comment "project-hamster/p2-service: cluster IP" -m tcp --dport 3000 -j KUBE-SVC-2A6FNMCK6FDH7PJH
-A KUBE-SVC-2A6FNMCK6FDH7PJH -m comment --comment "project-hamster/p2-service:" -j KUBE-SEP-6U447UXLLQIKP7BB

➜ ssh cluster1-node1 iptables-save | grep p2-service
-A KUBE-SEP-6U447UXLLQIKP7BB -s 10.44.0.20/32 -m comment --comment "project-hamster/p2-service:" -j KUBE-MARK-MASQ
-A KUBE-SEP-6U447UXLLQIKP7BB -p tcp -m comment --comment "project-hamster/p2-service:" -m tcp -j DNAT --to-destination 10.44.0.20:80
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.97.45.18/32 -p tcp -m comment --comment "project-hamster/p2-service: cluster IP" -m tcp --dport 3000 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.97.45.18/32 -p tcp -m comment --comment "project-hamster/p2-service: cluster IP" -m tcp --dport 3000 -j KUBE-SVC-2A6FNMCK6FDH7PJH
-A KUBE-SVC-2A6FNMCK6FDH7PJH -m comment --comment "project-hamster/p2-service:" -j KUBE-SEP-6U447UXLLQIKP7BB

➜ ssh cluster1-node2 iptables-save | grep p2-service
-A KUBE-SEP-6U447UXLLQIKP7BB -s 10.44.0.20/32 -m comment --comment "project-hamster/p2-service:" -j KUBE-MARK-MASQ
-A KUBE-SEP-6U447UXLLQIKP7BB -p tcp -m comment --comment "project-hamster/p2-service:" -m tcp -j DNAT --to-destination 10.44.0.20:80
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.97.45.18/32 -p tcp -m comment --comment "project-hamster/p2-service: cluster IP" -m tcp --dport 3000 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.97.45.18/32 -p tcp -m comment --comment "project-hamster/p2-service: cluster IP" -m tcp --dport 3000 -j KUBE-SVC-2A6FNMCK6FDH7PJH
-A KUBE-SVC-2A6FNMCK6FDH7PJH -m comment --comment "project-hamster/p2-service:" -j KUBE-SEP-6U447UXLLQIKP7BB

Great. Now let's write these rules into the requested file:

➜ ssh cluster1-controlplane1 iptables-save | grep p2-service >> /opt/course/p2/iptables.txt
➜ ssh cluster1-node1 iptables-save | grep p2-service >> /opt/course/p2/iptables.txt
➜ ssh cluster1-node2 iptables-save | grep p2-service >> /opt/course/p2/iptables.txt
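
To make the 3000->80 mapping visible in those rules, the Service address and the Pod address can be pulled out with sed. This is a sketch over two sample rules copied from the output above; the variable names are illustrative only.

```shell
# Two sample rules copied from the iptables-save output above.
rules=$(cat <<'EOF'
-A KUBE-SEP-6U447UXLLQIKP7BB -p tcp -m comment --comment "project-hamster/p2-service:" -m tcp -j DNAT --to-destination 10.44.0.20:80
-A KUBE-SERVICES -d 10.97.45.18/32 -p tcp -m comment --comment "project-hamster/p2-service: cluster IP" -m tcp --dport 3000 -j KUBE-SVC-2A6FNMCK6FDH7PJH
EOF
)

# Cluster IP and Service port, from the KUBE-SERVICES rule that jumps to KUBE-SVC-*.
service_addr=$(printf '%s\n' "$rules" |
  sed -n 's#.*-d \([0-9.]*\)/32 .*--dport \([0-9]*\) -j KUBE-SVC.*#\1:\2#p')

# Pod IP and target port, from the DNAT rule.
pod_addr=$(printf '%s\n' "$rules" |
  sed -n 's#.*--to-destination \([0-9.:]*\).*#\1#p')

echo "$service_addr -> $pod_addr"
```

For the sample rules this prints 10.97.45.18:3000 -> 10.44.0.20:80, which matches the clusterIP, port and targetPort of p2-service shown earlier.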

Delete the Service and confirm iptables rules are gone

Delete the Service:

k -n project-hamster delete svc p2-service

And confirm the iptables rules are gone:

➜ ssh cluster1-controlplane1 iptables-save | grep p2-service
➜ ssh cluster1-node1 iptables-save | grep p2-service
➜ ssh cluster1-node2 iptables-save | grep p2-service

Done.
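Empty grep output is easy to misread, so the same check can be turned into an explicit pass/fail. A minimal sketch, where `rules_gone` is a hypothetical helper name introduced here:

```shell
# rules_gone SERVICE
# Reads iptables-save output on stdin.
# Succeeds (exit 0) only when no rule mentions SERVICE.
rules_gone() {
  ! grep -q -- "$1"
}
```

For example, `ssh cluster1-node1 iptables-save | rules_gone p2-service && echo "node1 clean"` would print the message only if no p2-service rule is left on that node.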

Kubernetes Services are implemented using iptables rules (with the default configuration) on all nodes. Whenever a Service is created, altered, or deleted, or the Endpoints of a Service change, the kube-proxy on each node, which watches the kube-apiserver for such changes, updates the iptables rules to match the current state.
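To see what those rules amount to, the saved output can be reduced to the actual mapping from the Service's cluster IP and port to the Pod endpoint. The following is a rough sketch, not part of the original solution: `service_route` is a hypothetical helper, and it assumes the rule layout shown above (a `-d <ClusterIP>/32 ... --dport <port>` rule in KUBE-SERVICES and a `DNAT --to-destination <PodIP:port>` rule per endpoint).

```shell
# service_route SERVICE
# Reads iptables-save output on stdin and prints one line per endpoint:
#   <ClusterIP>:<port> -> <PodIP>:<port>
service_route() {
  awk -v svc="$1" '
    index($0, svc) == 0 { next }          # skip rules of other Services
    {
      for (i = 1; i < NF; i++) {
        if ($i == "-d")               cluster_ip = $(i + 1)
        if ($i == "--dport")          port = $(i + 1)
        if ($i == "--to-destination") endpoints[$(i + 1)] = 1
      }
    }
    END {
      sub(/\/32$/, "", cluster_ip)        # drop the /32 mask from the cluster IP
      for (ep in endpoints) print cluster_ip ":" port " -> " ep
    }
  '
}
```

Run against the file written earlier, `service_route p2-service < /opt/course/p2/iptables.txt` would print `10.97.45.18:3000 -> 10.44.0.20:80` for the rules shown in this task. Duplicate rules from several nodes collapse into a single line because the endpoints are used as array keys.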

CKA Exam Questions

1. Node list

2. service

3. deployment

4. static pod

5. namespace

6. Nodeport

7. Get NodeInfo

8. Pod Scheduling 1

9. Deployment Rolling Update

10. Secret

11. etcd backup

12. Volume

13. Security Context for a Pod

14. PV & PVC

14. Pod scheduling

15. Init Pod

16. JSON file

17. ConfigMap

18. Troubleshoot

19. Troubleshooting

20. Credentials

21. user context

22. role

23. DNS

24. ServiceAccount

25. JSONPATH

26. Environment Variables

27. Security Context

28. Contexts

29. Service

30. Schedule Pod on Controlplane Nodes

31. Scale down StatefulSet

32. Pod Ready if Service is reachable

33. Kubectl sorting

34. Storage, PV, PVC, Pod volume

35. Node and Pod Resource Usage

36. Get Controlplane Information

37. Kill Scheduler, Manual Scheduling

38. RBAC ServiceAccount Role RoleBinding

39. DaemonSet on all Nodes

40. Deployment on all Nodes

41. Multi Containers and Pod shared Volume

42. Find out Cluster Information

43. Cluster Event Logging

44. Namespaces and Api Resources

45. Find Container of Pod and check info

46. Fix Kubelet

47. Create Secret and mount into Pod

48. Update Kubernetes Version and join cluster

49. Create a Static Pod and Service

50. Check how long certificates are valid

51. Kubelet client/server cert info

52. NetworkPolicy

53. Etcd Snapshot Save and Restore

54. Find Pods first to be terminated

55. Curl Manually Contact API

56. ETCD information

57. Service